2025 Summer Data Engineering Intern - Prescient Design

Genentech
United States, New York, New York
149 5th Avenue
Dec 19, 2024
The Position

Department Summary:

The Large Language Model (LLM) team within Prescient Design is seeking undergraduate interns with experience in software and data engineering, as well as a keen interest in LLMs and biomedical data.

One of Prescient's new, innovative efforts is developing state-of-the-art LLMs for scientific discovery and biomedical applications. We envisage LLMs being used across the drug discovery and development pipeline, in applications ranging from scientific document classification to conversational models and multimodal learning of complex data types, including biological sequences and high-resolution microscopy images.

As part of your internship, you'll work on projects that involve building out LLM data infrastructure, improving data accessibility by contributing to our information retrieval platform, and processing data at large scale.

This internship position is located in New York City, on-site.

Key Responsibilities:

  • Develop systems to improve data infrastructure, with a focus on data quality and reliability.

  • Design and implement data pipelines that efficiently process, transform, and consolidate large-scale data from multiple sources.

  • Create user-friendly interfaces and tools that enable researchers to easily query, discover, and interact with complex data repositories.

Program Highlights

  • Intensive 12-week, full-time (40 hours per week) paid internship.

  • Program start dates are in May/June (Summer).

  • A stipend, based on location, will be provided to help alleviate costs associated with the internship.

  • Ownership of challenging and impactful business-critical projects.

  • Work with some of the most talented people in the biotechnology industry.

Who You Are

Required Education: Must be pursuing a Bachelor's Degree (enrolled student).

Required Majors: Computer Science, Engineering, or related fields

Skills and qualifications:

  • Strong programming skills: proficiency in Python; the ability to write, optimize, debug, and test production-ready code; and previous software engineering work experience.

  • Familiarity with data integration, ETL processes, and large-scale data processing techniques.

  • Experience with or strong interest in LLMs, information retrieval, and data engineering is a huge plus!

Relocation benefits are not available for this job posting.

The expected salary range for this position based on the primary location of New York City is $45 hourly. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. This position also qualifies for paid holiday time off benefits.

#GNE-R&D-Interns-2025

Genentech is an equal opportunity employer, and we embrace the increasingly diverse world around us. Genentech prohibits unlawful discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin or ancestry, age, disability, marital status and veteran status.
