Day 2 - HUZZAH
We continued on to actually accessing the data and validating that it is indeed queryable. Because if it wasn’t, that would suck.
I added a validation step that queried the BLS API for each constructed series ID and inspected the response before ingestion.
No transformations yet, just wanted to make sure I could capture it as given with a focus on the fields I actually need.
Today’s observations:
- Some series did not exist in the format I assumed. (The very helpful readme docs on the gov site clarified this.)
- The API will happily report success while still giving you empty data. (Thanks for nothing.)
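A minimal sketch of what that inspection could look like, not the exact code I ran. It assumes the response has already been parsed from JSON and follows the layout I understand the BLS v2 API to use: a top-level "status" string and "Results" → "series" → a list of entries with "seriesID" and "data". The function name, the series IDs, and the value in the sample payload are all hypothetical.

```python
from __future__ import annotations

from typing import Any


def inspect_bls_response(payload: dict[str, Any], requested_ids: list[str]) -> dict[str, str]:
    """Label each requested series ID: 'ok', 'empty', 'missing', or 'request_failed'.

    Assumed layout: payload["status"] and payload["Results"]["series"],
    where each series entry has "seriesID" and a "data" list.
    """
    if payload.get("status") != "REQUEST_SUCCEEDED":
        # The request as a whole did not succeed, so no series can be trusted.
        return {sid: "request_failed" for sid in requested_ids}

    results = payload.get("Results")
    series_list = results.get("series", []) if isinstance(results, dict) else []
    data_by_id = {entry.get("seriesID"): entry.get("data") or [] for entry in series_list}

    report: dict[str, str] = {}
    for sid in requested_ids:
        if sid not in data_by_id:
            report[sid] = "missing"   # the API never returned this ID
        elif not data_by_id[sid]:
            report[sid] = "empty"     # "success", but no observations
        else:
            report[sid] = "ok"
    return report


# Hand-made payload with placeholder IDs and a made-up value, for illustration only.
SAMPLE_PAYLOAD = {
    "status": "REQUEST_SUCCEEDED",
    "Results": {
        "series": [
            {
                "seriesID": "EXAMPLE_HIRES_RATE",
                "data": [{"year": "2022", "period": "M01", "value": "1.0"}],
            },
            {"seriesID": "EXAMPLE_EMPTY", "data": []},
        ]
    },
}
```

The point of the 'empty' label is the second observation above: a successful status says nothing about whether any rows came back, so the check has to look at the data itself.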
Let’s define the terms.
- Job Openings and Labor Turnover Survey (JOLTS) data captures the literal movement of the labor market (hires, quits, separations), aka the churn of the labor market.
- Series: the numerical sequences assigned to each variable (hires, quits, separations) for each industry.
- American Community Survey (ACS) data is census data that breaks the market down demographically: population, labor force size, and employment counts.
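For a sense of what "constructing" a series ID means, here is a rough sketch. It assumes the JOLTS ID layout as I understand it from the BLS readme: the prefix JT, then a seasonal code, industry, state, area, size class, data element, and a rate/level flag, for 21 characters in total. The function name and defaults are my own, and the field widths and codes (HI for hires, QU for quits, TS for total separations) should be checked against the readme before relying on them.

```python
from __future__ import annotations


def build_jolts_series_id(
    data_element: str,           # assumed codes: "HI" hires, "QU" quits, "TS" total separations
    rate_or_level: str = "R",    # "R" for rate, "L" for level
    seasonal: str = "S",         # "S" seasonally adjusted, "U" unadjusted
    industry: str = "000000",    # all zeros assumed to mean total nonfarm
    state: str = "00",           # all zeros assumed to mean national
    area: str = "00000",
    size_class: str = "00",
) -> str:
    """Join the fixed-width fields into one series ID (layout is an assumption)."""
    fields = [
        ("prefix", "JT", 2),
        ("seasonal", seasonal, 1),
        ("industry", industry, 6),
        ("state", state, 2),
        ("area", area, 5),
        ("size_class", size_class, 2),
        ("data_element", data_element, 2),
        ("rate_or_level", rate_or_level, 1),
    ]
    for name, value, width in fields:
        # A field of the wrong width shifts every later field and yields a bad ID.
        if len(value) != width:
            raise ValueError(f"{name} must be {width} characters, got {value!r}")
    return "".join(value for _, value, _ in fields)


# National rate series for the three variables mentioned above.
national_rate_ids = [build_jolts_series_id(code) for code in ("HI", "QU", "TS")]
```

Checking the widths up front is a cheap guard against the first observation: an ID in the wrong format fails here instead of quietly returning nothing from the API.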
So, where do we stand now?
- JOLTS national rates ingested, validated, and stored cleanly (2022–2024)
- ACS national context ingested separately, validated, and stored cleanly (2022–2024)
- Raw tables kept to preserve original resolution and document “missingness”
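One way to document "missingness" for a monthly series is to compare the months actually stored against every month expected in the window. The sketch below is illustrative only: the function names are hypothetical, it works on plain Python values and not the DuckDB tables themselves, and it assumes BLS-style period codes ("M01" to "M12" for months, with other codes such as "M13" treated as non-monthly). It fits the monthly JOLTS rates; ACS is annual and would need a per-year check.

```python
from __future__ import annotations

from typing import Optional


def parse_period(year: str, period: str) -> Optional[tuple[int, int]]:
    """Turn ('2022', 'M01') into (2022, 1); return None for non-monthly codes."""
    if not (period.startswith("M") and period[1:].isdigit()):
        return None
    month = int(period[1:])
    if not 1 <= month <= 12:
        return None  # e.g. "M13", assumed here to be an annual figure
    return int(year), month


def missing_months(
    observed: set[tuple[int, int]],
    start_year: int = 2022,
    end_year: int = 2024,
) -> list[tuple[int, int]]:
    """List every (year, month) in the window that has no stored observation."""
    expected = [
        (year, month)
        for year in range(start_year, end_year + 1)
        for month in range(1, 13)
    ]
    return [ym for ym in expected if ym not in observed]
```

An empty list means the series covers the full 2022–2024 window. Anything else is the list of gaps to record alongside the raw table.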
[BLS API] [ACS API]
↓
Raw JOLTS national rates & Raw ACS detail (DuckDB)
↓
Validated via time series (2022–2024)
↓
Ready for analysis
Okay. Time for hot chocolate & bed. Good night.