Astronomers gain access to massive early-universe dataset
The Hobby-Eberly Telescope Dark Energy Experiment (HETDEX) has released over half a petabyte of data mapping more than one million distant galaxies. The dataset contains 600 million spectra collected between 2017 and 2024, offering researchers raw material to study how galaxies formed when the universe was between 1.8 and 3.2 billion years old.
The release, published June 3 in the Astrophysical Journal, makes the data freely available to scientists, students, and citizen researchers. The database includes catalogs of 500,000 nearby star-forming galaxies, 18,000 supermassive black holes, and over 150,000 stars.
What the data reveals
HETDEX used spectroscopy to break light into wavelengths, creating a spectrum for each observed point. This technique reveals an object's chemistry, temperature, mass, movement through space, and distance from Earth.
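As a rough illustration of how a spectrum yields distance, the sketch below computes a redshift from the observed wavelength of a known emission line. It assumes the line is hydrogen Lyman-alpha (rest wavelength about 1215.67 angstroms), a common tracer of distant galaxies. The function name and the sample wavelength are hypothetical and are not taken from the HETDEX tools.

```python
# Assumption: the emission line is hydrogen Lyman-alpha.
LYMAN_ALPHA_REST_ANGSTROM = 1215.67


def redshift_from_line(observed_angstrom, rest_angstrom=LYMAN_ALPHA_REST_ANGSTROM):
    """Return the redshift z = observed / rest - 1 for one spectral line."""
    if observed_angstrom <= 0 or rest_angstrom <= 0:
        raise ValueError("wavelengths must be positive")
    return observed_angstrom / rest_angstrom - 1.0


# Hypothetical example: a line observed at 4500 angstroms.
z = redshift_from_line(4500.0)
print(f"z is roughly {z:.2f}")
```

A larger redshift means the light has been stretched more by cosmic expansion, so the source is more distant and is seen at an earlier time.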
"This is a spectral map of the universe," said Erin Mentuch Cooper, HETDEX data manager and lead author of the paper. "It turns every point of light into a barcode of physics."
The survey observed a region of sky equivalent to about 2,000 full moons, centered on areas near the Big Dipper and Orion. The 431,000 data cubes map information in three dimensions, with each cube covering roughly one-thirtieth the size of the full moon.
Dark energy and cosmic expansion
HETDEX's primary goal is to use the galaxy map to investigate the universe's expansion history and the nature of dark energy. Dark energy was proposed nearly 30 years ago, after observations showed the universe's expansion was accelerating, but what it is made of remains unknown.
"The new observations should place strong constraints on evolutionary models of the universe," said Donghui Jeong, professor of astronomy and astrophysics at Penn State.
AI and citizen science in data processing
The team processed the raw data down to 10 terabytes and developed tools to help both human and artificial intelligence users analyze it. Software automatically removed contamination from satellites and meteors, while automated methods identified early galaxies in the observations.
More than 24,000 citizen scientists confirmed the presence of galaxies through the Dark Energy Explorers program, which ran in parallel with the automated analysis.
Access and tools
Researchers can download customized data subsets based on sky location. The collaboration with the University of Texas at Austin's Texas Advanced Computing Center provides access to cloud-based supercomputing resources, lowering barriers for working with datasets at this scale.
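Selecting data by sky location usually amounts to a cone search: keep every entry within some angular radius of a chosen position. The sketch below shows that idea in plain Python. The catalog rows, field names, and coordinates are hypothetical and do not reflect the actual HETDEX download interface.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula), inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1 to guard against floating-point overshoot before asin.
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def cone_select(rows, ra0, dec0, radius_deg):
    """Return the rows whose position lies within radius_deg of (ra0, dec0)."""
    return [row for row in rows
            if angular_separation_deg(row["ra"], row["dec"], ra0, dec0) <= radius_deg]


# Hypothetical catalog entries; positions are right ascension and declination in degrees.
catalog = [
    {"id": "a", "ra": 180.0, "dec": 50.0},
    {"id": "b", "ra": 180.5, "dec": 50.2},
    {"id": "c", "ra": 85.0, "dec": 0.0},
]

nearby = cone_select(catalog, ra0=180.0, dec0=50.0, radius_deg=1.0)
print([row["id"] for row in nearby])
```

For catalogs with millions of entries, a spatial index or a dedicated astronomy library would be a more practical choice than this linear scan.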
The team created extensive tutorials and tools alongside the release. "We've turned more than half a billion spectra into something you can actually explore," Mentuch Cooper said.
To access the HETDEX database, visit the HETDEX project website.