What India's AQI Can See, And What It Can't

AQI is a score from a monitored place, not a map of the air everyone breathes. India's public raw archive is real and deeper than the dashboard suggests, but the inspectable PM2.5 history is late, uneven and city-heavy.

AQI is a score, not the air

India's AQI is the number most people see first. CPCB's official National AQI material describes it as a score built from pollutant sub-indices. In plain English, each pollutant is converted into its own index score, and the highest available sub-index becomes the overall AQI. The pollutant behind that highest score is reported as the prominent pollutant. CPCB's public categories run from Good and Satisfactory through Moderate, Poor and Very Poor to Severe.
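The "highest sub-index wins" rule can be sketched in a few lines. This is a minimal illustration, not CPCB's implementation: the function name, the example sub-index values and the handling of missing pollutants are assumptions, and the sketch ignores any minimum-data rules in the official method. The category bands are the published 0-50, 51-100, 101-200, 201-300, 301-400 and 401-500 ranges.

```python
from typing import Optional

# Published National AQI category bands (upper bound of each band, inclusive).
BANDS = [(50, "Good"), (100, "Satisfactory"), (200, "Moderate"),
         (300, "Poor"), (400, "Very Poor"), (500, "Severe")]


def overall_aqi(sub_indices: dict[str, Optional[float]]) -> tuple[int, str, str]:
    """Return (AQI, pollutant behind it, category) from per-pollutant sub-indices.

    Pollutants whose sub-index is None are treated as unavailable (an assumption).
    """
    available = {p: v for p, v in sub_indices.items() if v is not None}
    if not available:
        raise ValueError("no sub-index available")
    driver = max(available, key=available.get)  # pollutant with the highest score
    score = round(available[driver])
    category = next((name for upper, name in BANDS if score <= upper), "Severe")
    return score, driver, category


# Hypothetical sub-indices for one monitor at one moment.
example = {"PM2.5": 182.0, "PM10": 140.0, "NO2": 45.0, "O3": None}
print(overall_aqi(example))  # (182, 'PM2.5', 'Moderate')
```

Note what the return value drops: the PM10 and NO2 scores and the missing ozone reading never reach the reader, which is the compression this article is about.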

That is useful design. A single score is easier to read than a table of PM2.5, PM10, nitrogen dioxide, sulphur dioxide, carbon monoxide, ozone and other pollutants. But it is also a compression. AQI tells you the public signal from a monitor at a place and time. It does not show every pollutant cell, every missing instrument, every station's history, or whether a public file has enough PM2.5 days for trend work.

This article starts underneath the score. The daily repository audit checked 598 CPCB station records. In 2009, CPCB listed 25 daily files in the audited archive, but only 1 had usable PM2.5. In 2017, there were 504 listed station-year files, but only 80 had usable PM2.5. By 2025, the same audit found 579 listed files and 540 usable PM2.5 station-years. The lesson is simple: files, rows and pollutant cells are different layers of public evidence.
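The gap between listed files and usable PM2.5 can be checked with simple arithmetic. The sketch below uses only the counts quoted above; the function name is illustrative, and the percentage share is a derived figure rather than one the audit reports.

```python
# (listed station-year files, station-years with any usable PM2.5), as quoted above.
audit = {2009: (25, 1), 2017: (504, 80), 2025: (579, 540)}


def layer_gap(listed: int, usable: int) -> tuple[int, float]:
    """Return (listed files with no usable PM2.5, usable share in percent)."""
    return listed - usable, round(100 * usable / listed, 1)


for year, (listed, usable) in audit.items():
    no_pm, share = layer_gap(listed, usable)
    print(year, listed, usable, no_pm, f"{share}%")
# 2009 25 1 24 4.0%
# 2017 504 80 424 15.9%
# 2025 579 540 39 93.3%
```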

Chart 2

Listed files and usable PM2.5 history diverge

CPCB daily raw repository · listed files, usable PM2.5 years and files with no PM2.5 · 2009-2025

station-years
579

Repository files listed · 2025 · latest point

Line chart: station-years (axis 0 to 600) by year, 2009-2025. Values at the latest point, 2025: Repository files listed 579; Usable PM2.5 station-years 540; Listed files with no PM2.5 39.

The daily repository grows to 579 listed station-year files by 2025, but the usable PM2.5 history catches up much later.

This is the article's core chart because it separates three layers: a listed daily file, a station-year with any usable PM2.5, and a listed file with zero PM2.5. The odd 2017 jump is real, but it is a repository-listing finding: CPCB lists 504 daily station-year files, only 80 have any usable PM2.5, and 424 have none. By 2025 the gap is much smaller, with 579 listed files and 540 usable PM2.5 station-years.

How to read: Read each year vertically. The red line counts listed station-year file paths. The dark line counts station-years with at least one non-missing PM2.5 daily value. The green line counts listed files with no usable PM2.5 at all, so a wide gap between the red and dark lines means the repository looks deeper than the PM2.5 evidence inside it.

Watch out: Do not read the red line as operating stations, city coverage or pollution exposure. It is only the count of station-year file paths returned by the repository.

What raw data can the public download?

CPCB's public raw repository is broader than a daily CSV archive. On 29 June 2026, a direct probe checked 598 station records against the public dataRepository/file-path endpoint for raw data at 15-minute, hourly, 8-hour and daily frequencies. It returned 4,346 listed 15-minute raw files, 4,367 hourly files, 4,396 8-hour files and 4,455 daily files.

That is a real public surface. It matters because the official system is not only an AQI card on a dashboard. It also exposes file paths that can be reproduced by script. But a listed path is still only a listing. It says CPCB has a raw file path for a station, frequency and year. It does not prove that the CSV has rows. It does not prove PM2.5 or PM10 is present. It does not prove the year is complete enough for a trend.

A narrow spot-check confirmed that sub-daily files are not imaginary. For Sanjay Palace, Agra in 2025, the same public download endpoint returned parseable CSVs at all four frequencies: 35,040 rows at 15-minute frequency, 8,760 hourly rows, 1,095 8-hour rows and 365 daily rows. Each file had PM2.5 and PM10 columns with many non-missing values. Treat that as a proof of access, not a completeness audit for every sub-daily file.
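Those four row counts are exactly what a gap-free 365-day year would produce at each frequency. A short check, assuming one row per interval and no duplicated or dropped timestamps (the function name is illustrative):

```python
MINUTES_PER_DAY = 24 * 60


def expected_rows(days: int, interval_minutes: int) -> int:
    """Rows in a gap-free file with one row per interval."""
    return days * MINUTES_PER_DAY // interval_minutes


intervals = {"15-minute": 15, "hourly": 60, "8-hour": 480, "daily": 1440}
for name, minutes in intervals.items():
    print(name, expected_rows(365, minutes))
# 15-minute 35040
# hourly 8760
# 8-hour 1095
# daily 365
```

A matching row count only shows that the timestamps are all there. It says nothing about whether the pollutant cells in those rows are filled.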

The AQI repository is a different case. The CPCB interface has a /repository/aqi route and app-code strings for AQI repository listing and Excel export. In this run, I could not reproduce a stable public AQI audit endpoint comparable to the raw file-path call. Several dashboard routes used an encrypted browser proxy. So the raw repository is treated as scriptably verified here. The AQI repository is treated as visible in the official interface, but not used for locked historical numbers.

Chart 3

CPCB's raw archive extends beyond daily files

Public raw repository probe · file listings at 15-minute, hourly, 8-hour and daily frequency · 598 station records

listed files
15-minute raw files
4,346
Hourly raw files
4,367
8-hour raw files
4,396
Daily raw files
4,455

CPCB's verified raw endpoint lists thousands of files at 15-minute, hourly, 8-hour and daily frequency.

The file-list probe returned 4,346 listed 15-minute raw files, 4,367 hourly files, 4,396 8-hour files and 4,455 daily files across 598 station records. A separate spot-check for Sanjay Palace, Agra in 2025 confirmed that sub-daily files can be downloaded and parsed, but this chart itself counts listings only. It does not validate row completeness or pollutant cells inside every sub-daily file.

How to read: Each bar counts file paths returned by CPCB for one raw frequency. The similar bar lengths mean the repository exposes a comparable number of station-year files across frequencies, not that the files are equally complete.

Watch out: Do not assume listed 15-minute, hourly or 8-hour files have the same completeness as the daily files audited in detail.

How much PM evidence is actually visible?

The deeper audit downloaded and checked 4,455 daily raw CSVs. Across those files, it found 916,083 non-missing PM2.5 station-days and 872,770 non-missing PM10 station-days. A station-day is one station with one non-missing pollutant value on one date. It is a coverage unit, not a pollution concentration.
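Counting station-days in one daily file amounts to counting non-missing cells in a pollutant column. The sketch below is one way to do that with the standard library. The column names, the missing-value markers and the sample rows are all assumptions for illustration, not the actual CPCB file layout.

```python
import csv
import io

MISSING = {"", "NA", "None", "NaN"}  # assumed missing-value markers


def count_station_days(csv_text: str, column: str) -> int:
    """Count rows where `column` holds a non-missing value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(1 for row in reader
               if (row.get(column) or "").strip() not in MISSING)


# Hypothetical four-day file for one station.
sample = """Timestamp,PM2.5,PM10,NO2
2025-01-01,88.4,151.0,31.2
2025-01-02,,140.5,29.8
2025-01-03,NA,,30.1
2025-01-04,92.1,160.3,
"""
print(count_station_days(sample, "PM2.5"))  # 2
print(count_station_days(sample, "PM10"))   # 3
```

The file has four rows, but it contributes two PM2.5 station-days and three PM10 station-days. Summing that count over every downloaded file gives an archive-wide total of the kind quoted above.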

The 2017 spike in listed files is the chart's warning label. CPCB lists 504 daily station-year files for 2017, up from 50 in 2016. But only 80 of the 2017 files have any usable PM2.5, and 424 have zero usable PM2.5. Many are CSV shells with daily timestamp rows and blank particulate columns. They are public files, but not public PM2.5 evidence.

The inspectable particle record becomes much stronger after 2020. PM2.5 station-days rise from 15,384 in 2017 to 74,169 in 2020, 153,321 in 2023 and 173,983 in 2025. PM10 follows the same broad pattern in the audited daily files. Do not read that as worsening air. It means more public observations are visible in the repository.

Chart 4

Usable PM observations become broad only after 2020

Downloaded CPCB daily files · non-missing PM2.5 and PM10 station-days · 2009-2025

station-days
173,983

PM2.5 station-days · 2025 · latest point

Line chart: station-days (axis 0 to 200,000) by year, 2009-2025. Values at the latest point, 2025: PM2.5 station-days 173,983; PM10 station-days 173,787.

The audited daily files contain 916,083 non-missing PM2.5 station-days and 872,770 PM10 station-days, with most depth arriving after 2020.

A station-day is one station with one non-missing pollutant value on one date. PM2.5 station-days rise from 15,384 in 2017 to 74,169 in 2020 and 173,983 in 2025. That is a rise in inspectable observations, not a rise in pollution concentration. The same logic applies to PM10, which is plotted beside PM2.5 as a second particle-coverage check.

How to read: Follow the lines over time as counts of non-missing daily PM values. PM2.5 and PM10 move together in recent years, which shows that the public particle record becomes much more usable in the 2020s.

Watch out: Do not say a rising line means India's air worsened. A coverage line rises when more station-days are visible, even if concentration levels move differently.

When is a station-year ready for trend work?

A station-year with one PM2.5 day is visible. It is not a serious yearly record. That is why this audit counts four thresholds: any PM2.5 day, 30 or more days, 180 or more days and 300 or more days.

The distinction is large in the older archive. In 2017, 80 station-years had any PM2.5 day, 76 had at least 30 days, 42 had at least 180 days and only 23 had at least 300 days. By 2025, the four counts were 540, 536, 490 and 444. The gap narrows, but it does not disappear.

The 300-day line is not an official CPCB standard. It is an audit threshold for readers who want a fuller annual history. The point is practical. A dashboard can show a station. A file list can show a year. A trend analyst still has to ask how many actual pollutant days sit inside that station-year.
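The four thresholds are a counting rule, and a short sketch makes it concrete. The station names and day counts below are hypothetical. Only the thresholds (any day, 30, 180 and 300 days) come from the audit described above.

```python
THRESHOLDS = (1, 30, 180, 300)  # minimum non-missing PM2.5 days in a station-year


def threshold_counts(days_per_station_year: dict) -> dict:
    """Count station-years that meet each minimum-day threshold."""
    return {t: sum(1 for d in days_per_station_year.values() if d >= t)
            for t in THRESHOLDS}


# Hypothetical (station, year) -> non-missing PM2.5 days.
sample = {
    ("Station A", 2017): 1,
    ("Station B", 2017): 45,
    ("Station C", 2017): 210,
    ("Station D", 2017): 340,
    ("Station E", 2017): 0,   # listed file, no usable PM2.5
}
print(threshold_counts(sample))  # {1: 4, 30: 3, 180: 2, 300: 1}
```

Four of the five station-years are "visible", but only one would pass a 300-day bar. The same shrinkage appears in the real 2017 counts of 80, 76, 42 and 23.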

Chart 5

Trend-ready PM2.5 years lag basic visibility

Downloaded CPCB daily files · station-years by minimum count of non-missing PM2.5 days · 2009-2025

station-years
540

Any PM2.5 day · 2025 · latest point

Line chart: station-years (axis 0 to 600) by year, 2009-2025. Values at the latest point, 2025: Any PM2.5 day 540; 30+ PM2.5 days 536; 180+ PM2.5 days 490; 300+ PM2.5 days 444.

In 2025, 540 PM2.5 station-years had any data, but only 444 had at least 300 non-missing days.

This chart separates minimal visibility from fuller annual evidence. In 2017, 80 station-years had any PM2.5, but only 23 reached 300 days. By 2025 the gap is smaller, but it still tells analysts not to treat any-data counts as ready-made trend panels.

How to read: Each line raises the minimum bar. The top line counts station-years with any PM2.5 day, while the lower lines require 30, 180 or 300 non-missing days. The spread between lines is the difference between seeing a station-year once and having a fuller annual record.

Watch out: Do not treat 300 days as an official CPCB quality rule. It is an audit threshold chosen to make trend-readiness visible.

Where does the station map go thin?

The station network measures monitored India. It does not measure all of urban India. The CPCB/OAQ station hierarchy used here has 335 distinct city names. Census 2011 counts 4,041 statutory towns. On that denominator, 3,706 statutory towns, or about 92 percent, are outside this station-list comparison.
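The percentage comes from a simple subtraction and division. The sketch below reproduces it under one simplifying assumption that should be stated loudly: every station-list city name is treated as matching exactly one statutory town, which real name matching will not always satisfy.

```python
statutory_towns = 4041          # Census 2011 statutory towns
station_list_city_names = 335   # distinct city names in the CPCB/OAQ hierarchy

# Assumes each city name corresponds to exactly one statutory town.
outside = statutory_towns - station_list_city_names
share = 100 * outside / statutory_towns

print(outside, round(share, 1))  # 3706 91.7
```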

This is a place-visibility denominator. It is not a population-exposure estimate. It does not mean 92 percent of Indians lack AQI data. It also does not mean every statutory town needs its own monitor. Some towns are small, some are near larger airsheds, and station-list names do not map perfectly to Census names.

The narrower finding is still important. AQI is a monitor-based public signal. Before we even ask whether a file has PM2.5 cells, many legally urban places are outside the public station-list frame used in this audit.

Chart 6

The station list covers only a fraction of urban places

CPCB/OAQ city names compared with Census 2011 statutory towns

towns/city names
Statutory towns in Census 2011
4,041
City names in CPCB/OAQ station list
335
Statutory towns not visible in this station list
3,706

The CPCB/OAQ hierarchy has 335 city names, while Census 2011 counts 4,041 statutory towns.

This chart is a geography warning, not a monitor-siting prescription. It says about 92 percent of statutory towns are outside this station-list comparison, but it does not say 92 percent of Indians lack AQI data. It also does not say every statutory town needs a monitor, because airsheds and settlement size matter.

How to read: Compare the smaller station-list city-name bar with the much larger Census statutory-town bar. The gap shows the difference between named places visible in this public station hierarchy and the legal urban frame.

Watch out: Do not convert this into a population claim or a one-monitor-per-town rule.

Which cities have the deepest PM2.5 archive?

The usable PM2.5 archive is city-heavy. New Delhi has 73,348 PM2.5 station-days in this audit across 25 stations with any usable PM2.5. Mumbai has 41,502 across 28 such stations. Delhi has 39,328, Bengaluru 25,311 and Hyderabad 22,663.

That is not an air-quality ranking. A city can rank high because it has more stations, older stations, fewer missing PM2.5 cells, or all three. A lower-ranked city may be cleaner, dirtier or simply less visible in this public archive.

For national claims, this concentration matters. The public historical archive is strongest where the monitoring network is dense and older. It is weakest where stations arrived late, publish thin histories, or never appear in the public station list used here.

Chart 7

Deep PM2.5 history is concentrated in a few cities

Downloaded CPCB daily files · non-missing PM2.5 station-days by city · 2009-2025

PM2.5 station-days
New Delhi
73,348
Mumbai
41,502
Delhi
39,328
Bengaluru
25,311
Hyderabad
22,663
Kolkata
16,714
Ahmedabad
16,404
Lucknow
14,792
Patna
13,799
Noida
13,253
Gurugram
12,805
Chennai
12,387
Jaipur
11,938
Faridabad
10,843

New Delhi, Mumbai and Delhi dominate the usable PM2.5 station-day archive.

New Delhi has 73,348 usable PM2.5 station-days, Mumbai has 41,502 and Delhi has 39,328 in the audited daily files. These counts combine station density, station age and missing-cell patterns. They tell us where the public historical archive is deepest, not where air is cleanest or dirtiest.

How to read: Bars rank cities by total non-missing daily PM2.5 station-days. A city can rank high because it has many stations, long-running stations, better PM2.5 completeness, or some mix of those factors.

Watch out: Do not compare these bars as pollution levels. They measure public evidence volume.

When did stations first become usable?

First usable PM2.5 year is stricter than first listed file year. It asks when a station first has any non-missing PM2.5 value in the daily repository files.

Only 200 of the 598 station records had usable PM2.5 before 2020. The recent record is much stronger: 508 station-years had usable PM2.5 in 2023, 524 in 2024 and 540 in 2025. Across the full daily audit, 396 stations had at least 1,000 usable PM2.5 days.

That makes the archive good for many current comparisons and weaker for long historical panels. A station appearing in an older repository year does not automatically give you a PM2.5 trend from that year. Sometimes the file exists, but the particle cells do not.
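"First usable year" is the earliest year in which a station's daily file has at least one non-missing PM2.5 value. A minimal sketch of that rule, using hypothetical stations and day counts:

```python
def first_usable_year(records):
    """Map each station to the earliest year with at least one PM2.5 day.

    `records` is an iterable of (station, year, non_missing_pm25_days).
    Stations with no usable PM2.5 in any year are left out of the result.
    """
    first = {}
    for station, year, days in records:
        if days > 0 and (station not in first or year < first[station]):
            first[station] = year
    return first


# Hypothetical station-year records.
records = [
    ("Station A", 2017, 0), ("Station A", 2018, 0),
    ("Station A", 2021, 212), ("Station A", 2022, 340),
    ("Station B", 2019, 15), ("Station B", 2020, 301),
    ("Station C", 2023, 0),
]
first = first_usable_year(records)
before_2020 = sum(1 for year in first.values() if year < 2020)
print(first)        # {'Station A': 2021, 'Station B': 2019}
print(before_2020)  # 1
```

Station A has listed files from 2017 but becomes usable only in 2021, and Station C never appears in the result. Both cases are why first listed year and first usable year have to be counted separately.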

Chart 8

Most station PM2.5 histories begin late

Downloaded CPCB daily files · first year with any non-missing PM2.5 value, by station

stations
2009
1
2011
4
2012
1
2013
2
2015
22
2016
14
2017
40
2018
50
2019
66
2020
52
2021
65
2022
74
2023
131
2024
21
2025
15

Only 200 of 598 station records had usable PM2.5 before 2020.

First usable year means the first year with any non-missing PM2.5 daily value in the downloaded repository files. It is not the station's commissioning date and not necessarily the first year the station operated. The chart shows why recent PM2.5 comparisons are much easier than long historical panels.

How to read: Each bar counts stations by the first year in which their daily files contain usable PM2.5. A pile-up in recent years means many stations become inspectable for PM2.5 late in the archive.

Watch out: Do not assume a late first usable year always means a new station. Earlier files may exist but have missing PM2.5 cells.

Where do file years outrun PM2.5 years?

Some station histories look long until you count usable PM2.5 years. Airoli in Navi Mumbai has 17 listed daily years but only 5 usable PM2.5 years, a gap of 12. City Railway Station and Sanegurava Halli in Bengaluru have 11 listed years and zero usable PM2.5 years in the audited daily files.

Across all 598 station records, 40 had no usable PM2.5 in daily files. That splits into 22 records with listed daily files but no usable PM2.5, and 18 records with no listed daily repository file in the audit.

This does not prove a station failed. It says the public daily repository does not support a PM2.5 history for those station-years. For a public audit, that is the difference that matters.
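The gap for each station is listed daily years minus years with any usable PM2.5. The sketch below applies that subtraction to the three stations named above and checks the 40-record split. The dictionary layout is illustrative.

```python
# (listed daily years, years with any usable PM2.5), as quoted above.
stations = {
    "Airoli, Navi Mumbai": (17, 5),
    "City Railway Station, Bengaluru": (11, 0),
    "Sanegurava Halli, Bengaluru": (11, 0),
}

gaps = {name: listed - usable for name, (listed, usable) in stations.items()}
for name, gap in sorted(gaps.items(), key=lambda item: -item[1]):
    print(name, gap)
# Airoli, Navi Mumbai 12
# City Railway Station, Bengaluru 11
# Sanegurava Halli, Bengaluru 11

# Records with no usable PM2.5: listed files but no PM2.5, plus no listed file at all.
no_usable_pm25 = 22 + 18
print(no_usable_pm25)  # 40
```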

Chart 9

Long file histories can hide short PM2.5 records

Downloaded CPCB daily files · listed file years minus years with any usable PM2.5, by station

listed years without PM2.5
Airoli, Navi Mumbai - MPCB
12
City Railway Station, Bengaluru - KSPCB
11
Sanegurava Halli, Bengaluru - KSPCB
11
Meelavittan, Thoothukudi - TNPCB
9
Sector- 16A, Faridabad - HSPCB
7
Sirifort, Delhi - CPCB
7
BWSSB Kadabesanahalli, Bengaluru - CPCB
7
Municipal Corporation Office, Tirunelveli - TNPCB
7
Parisutham Nagar, Thanjavur - TNPCB
7
Velippalayam, Nagapattinam - TNPCB
7
Uchapatti, Madurai - TNPCB
7
Kamadenu Nagar, Karur - TNPCB
7
VOC Nagar_SIPCOT, Ranipet - TNPCB
7
Nehru Nagar, Kanpur - UPPCB
6

Some stations have long listed file histories but little or no usable PM2.5 history.

Airoli in Navi Mumbai has 17 listed daily years but only 5 usable PM2.5 years. City Railway Station and Sanegurava Halli in Bengaluru have 11 listed years and zero usable PM2.5 years in the audited daily files. These are concrete examples of the file-versus-pollutant gap.

How to read: Each bar is listed daily years minus years with any non-missing PM2.5 for that station. A larger bar means more years where the public repository listed a daily file but did not provide usable PM2.5 history.

Watch out: Do not use the gap as proof that a monitor failed. Use it as evidence that the public daily files do not support PM2.5 history for those years.

What does a file with rows but no PM look like?

Some files are not empty. They still do not help with PM2.5 or PM10 history. The examples in this chart have daily rows and zero usable PM2.5 and PM10 days.

Rows are a middle layer of evidence. A file can contain timestamps and gas pollutants such as NO, NO2, SO2, CO or ozone while the PM columns are blank or unusable. For AQI, different pollutants can matter on different days. For a particle-pollution history, PM2.5 and PM10 cells have to be present.

This is why the article keeps file listing, rows and pollutant cells separate. If those are collapsed into one word, the archive looks cleaner than it is.
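The three layers can be told apart mechanically once a file is downloaded. The sketch below classifies CSV text as a listing with no rows, rows without particulate values, or rows with particulate values. The header names, missing-value markers, layer labels and sample files are assumptions for illustration, not the actual CPCB layout.

```python
import csv
import io

MISSING = {"", "NA", "None", "NaN"}   # assumed missing-value markers
PM_COLUMNS = ("PM2.5", "PM10")        # assumed header names


def evidence_layer(csv_text: str) -> str:
    """Classify a downloaded daily file by the deepest evidence layer it reaches."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return "listed, no rows"
    has_pm = any((row.get(col) or "").strip() not in MISSING
                 for row in rows for col in PM_COLUMNS)
    return "rows with PM values" if has_pm else "rows, no PM values"


header_only = "Timestamp,PM2.5,PM10,NO2\n"
gas_only = "Timestamp,PM2.5,PM10,NO2\n2009-01-01,,,41.0\n2009-01-02,,,38.5\n"
with_pm = "Timestamp,PM2.5,PM10,NO2\n2025-01-01,88.4,,31.2\n"

print(evidence_layer(header_only))  # listed, no rows
print(evidence_layer(gas_only))     # rows, no PM values
print(evidence_layer(with_pm))      # rows with PM values
```

The middle case is the one this section describes: a non-empty file with gas readings that still contributes nothing to a particle history.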

Chart 10

Daily rows can exist while PM values are blank

Selected CPCB daily files with rows and zero usable PM2.5 or PM10 values

daily rows
DTU, Delhi, 2009
365
IHBAS, Dilshad Garden, Delhi, 2009
365
Alandur Bus Depot, Chennai, 2009
306
BTM Layout, Bengaluru, 2009
306
Manali, Chennai, 2009
245
Airoli, Navi Mumbai, 2009
184
Sector- 16A, Faridabad, 2010
122
Ardhali Bazar, Varanasi, 2010
61
Lalbagh, Lucknow, 2010
61

Some daily files contain rows but zero usable PM2.5 and PM10 days.

These representative examples span full-year and partial-year CSVs. A file can contain timestamps and non-PM pollutant columns while both particle columns are blank or unusable. That is why row counts are a middle layer of evidence, not the final answer for PM history.

How to read: Bars count daily rows in selected files where both PM2.5 and PM10 usable day counts are zero. The point is not which example is largest, but that non-empty files can still be useless for particle-history work.

Watch out: Do not call a file usable for PM history just because it has timestamps or gas-pollutant values.

What does OAQ add?

OAQ is useful because it mirrors current CPCB data in a cleaner API shape. The inspected CPCB snapshot was generated on 23 June 2026 at 17:40 IST. It had 598 station records. At that moment, 203 records had at least one of the eight main pollutant fields, 139 had all eight, 185 had PM2.5 and 183 had PM10. The other 395 records had no main pollutant value in that snapshot.

That is latest visibility, not deep history. It tells you what the mirror saw at one moment. It does not replace the CPCB repository audit of daily historical files.
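The snapshot counts follow from checking which pollutant fields are populated in each station record. A minimal sketch of that tally is below. The field names and sample records are hypothetical placeholders, not the OAQ schema; only the idea of "any", "all" and "none" of eight main fields comes from the snapshot described above.

```python
# Hypothetical names for the eight main pollutant fields (an assumption).
MAIN_FIELDS = ("pm25", "pm10", "no2", "so2", "co", "o3", "nh3", "pb")


def snapshot_summary(records, fields=MAIN_FIELDS):
    """Tally records with any, all or none of the main pollutant fields present."""
    present = [sum(r.get(f) is not None for f in fields) for r in records]
    return {
        "records": len(records),
        "any": sum(n > 0 for n in present),
        "all": sum(n == len(fields) for n in present),
        "none": sum(n == 0 for n in present),
    }


# Three hypothetical station records at one moment.
records = [
    dict.fromkeys(MAIN_FIELDS, 10.0),        # every field reporting
    {"pm25": 61.0},                          # PM2.5 only
    {"station": "Example", "pm25": None},    # metadata, no pollutant value
]
summary = snapshot_summary(records)
print(summary)  # {'records': 3, 'any': 2, 'all': 1, 'none': 1}
```

The "any" and "none" tallies always sum to the record count, which is the same relationship as 203 plus 395 equalling 598 in the inspected snapshot.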

The station metadata snapshot also has useful public fields. All 598 records had coordinates, station type, a latest-seen timestamp and an agency or operator suffix in the station name. It did not expose clean commissioning or activation dates. The public can inspect where a station is and who appears to operate it. It cannot cleanly inspect every station's start date from this metadata alone.

Chart 11

The latest OAQ mirror is a snapshot, not history

OAQ CPCB latest snapshot · generated 23 Jun 2026 at 17:40 IST

station records
CPCB station records in latest OAQ snapshot
598
Records with any main pollutant at that moment
203
Records with all 8 main pollutants
139
Records with PM2.5 at that moment
185
Records with PM10 at that moment
183
Records with no main pollutant at that moment
395

In the inspected OAQ snapshot, 598 CPCB station records were present, but 395 had no main pollutant value at that moment.

OAQ is useful because it mirrors current CPCB data in a cleaner API shape. In the 23 June 2026 snapshot, 203 records had at least one of the eight main pollutant fields and 139 had all eight. That makes OAQ valuable for latest visibility and metadata, but not a replacement for CPCB's historical repository audit.

How to read: Read the bars as one moment in time. The full station-record bar is the denominator; the pollutant bars show how many records had current values in that snapshot; the no-main-pollutant bar shows how sparse a latest mirror can be.

Watch out: Do not treat a latest snapshot as a complete history, or as proof that a station never reports pollutants.

How should you read these numbers?

Read this as a visibility audit, not as a pollution trend. The daily audit used CPCB's public raw repository to list, download and inspect station-year CSVs. The frequency audit counted raw file listings at 15-minute, hourly, 8-hour and daily frequencies. The sub-daily spot-check proved those downloads can work for one station-year. The PM2.5 and PM10 completeness numbers come from downloaded daily files only.

The public surfaces are different. The all-India AQI portal is the current public signal. Advance search is a human-oriented dashboard and export surface. The CPCB raw repository endpoint is reproducible and audited here. OAQ is a clean mirror of current CPCB data. The AQI repository is visible in the CPCB interface and app bundle, but a stable public AQI audit endpoint was not reproduced for this article.

The caveats are real. This audit does not validate calibration, siting, QA flags, monitor downtime or whether a station represents a whole city. The statutory-town comparison is a place denominator, not a population denominator. The 300-day threshold is an audit choice for trend-readiness, not an official rule.

What India can say with confidence is that public AQI sits on a real monitoring system, and CPCB exposes more raw public data than casual dashboard browsing suggests. What it cannot honestly say from these files alone is that the public can inspect a long, even, all-India PM2.5 record.