pandas is a data analysis toolkit for Python. It makes it easy to work with certain types of data, specially:
We will see some of pandas' features by working with a dataset of 311 service requests from New York City. This data is stored in a CSV file, which you can open with most spreadsheet software.
[Open the file in LibreOffice/Excel/etc.]
Each row in this file represents a service call (or "complaint") to 311 in New York City. The file has 51 columns, including columns such as:
Etc.
Given this data, let's say we wanted to answer the following questions:
With what we've seen so far this quarter, we already have the ability to answer these questions. We could load this data into a list of complaints, where each complaint is represented as tuple, dictionary, or object, and then use a variety of loops to process that data. However, pandas provides us with higher-level abstractions that will make it much easier to answer these questions. On top of that, pandas also handles many aspects of working with real-world data sets, like dealing with missing data or converting data to the right type (if we read the file manually, we would start with strings for everything, and would have to make sure everything was converted to the right type)
To import the pandas library we do this:
import pandas as pd
The most important data structure in pandas is the DataFrame, which represents tabular data, i.e., a table with rows and columns. For our purposes, we can assume that columns will always have labels (i.e., that a column always has a "name"). If we look at the 311 CSV file in the spreadsheet, this is the header of each column.
Rows can also have unique labels (this is called an "index"). This allows us to refer to individual rows by some unique identifier. If the dataset doesn't already include such an identifier, then pandas will simply label the rows from 0.
So, we are going to load the 311 CSV file using the read_csv
function. It will automatically detect that the first row contains the column labels, but we need to explicitly tell read_csv
that the row labels are contained in column Unique Key
complaints = pd.read_csv('311-service-requests.csv', index_col='Unique Key')
complaints
is a DataFrame. You can think of it as containing a table representing the data in the file.
complaints
Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | ... | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Unique Key | |||||||||||||||||||||
26589651 | 10/31/2013 02:08:41 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11432 | 90-03 169 STREET | 169 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.708275 | -73.791604 | (40.70827532593202, -73.79160395779721) |
26593698 | 10/31/2013 02:01:04 AM | NaN | NYPD | New York City Police Department | Illegal Parking | Commercial Overnight Parking | Street/Sidewalk | 11378 | 58 AVENUE | 58 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.721041 | -73.909453 | (40.721040535628305, -73.90945306791765) |
26594139 | 10/31/2013 02:00:24 AM | 10/31/2013 02:40:32 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10032 | 4060 BROADWAY | BROADWAY | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843330 | -73.939144 | (40.84332975466513, -73.93914371913482) |
26595721 | 10/31/2013 01:56:23 AM | 10/31/2013 02:21:48 AM | NYPD | New York City Police Department | Noise - Vehicle | Car/Truck Horn | Street/Sidewalk | 10023 | WEST 72 STREET | WEST 72 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.778009 | -73.980213 | (40.7780087446372, -73.98021349023975) |
26590930 | 10/31/2013 01:53:44 AM | NaN | DOHMH | Department of Health and Mental Hygiene | Rodent | Condition Attracting Rodents | Vacant Lot | 10027 | WEST 124 STREET | WEST 124 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.807691 | -73.947387 | (40.80769092704951, -73.94738703491433) |
26592370 | 10/31/2013 01:46:52 AM | NaN | NYPD | New York City Police Department | Noise - Commercial | Banging/Pounding | Club/Bar/Restaurant | 11372 | 37 AVENUE | 37 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.749989 | -73.881988 | (40.7499893014072, -73.88198770727831) |
26595682 | 10/31/2013 01:46:40 AM | NaN | NYPD | New York City Police Department | Blocked Driveway | No Access | Street/Sidewalk | 11419 | 107-50 109 STREET | 109 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.681533 | -73.831737 | (40.68153278675525, -73.83173699701601) |
26595195 | 10/31/2013 01:44:19 AM | 10/31/2013 01:58:49 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11417 | 137-09 CROSSBAY BOULEVARD | CROSSBAY BOULEVARD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.671816 | -73.843092 | (40.67181584567338, -73.84309181950769) |
26590540 | 10/31/2013 01:44:14 AM | 10/31/2013 02:28:04 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Talking | Club/Bar/Restaurant | 10011 | 258 WEST 15 STREET | WEST 15 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.739913 | -74.000790 | (40.73991339303542, -74.00079028612932) |
26594392 | 10/31/2013 01:34:41 AM | 10/31/2013 02:23:51 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11225 | 835 NOSTRAND AVENUE | NOSTRAND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.668204 | -73.950648 | (40.66820406598287, -73.95064760056546) |
26595176 | 10/31/2013 01:25:12 AM | NaN | NYPD | New York City Police Department | Noise - House of Worship | Loud Music/Party | House of Worship | 11218 | 3775 18 AVENUE | 18 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.634378 | -73.969462 | (40.63437840816299, -73.96946177104543) |
26591982 | 10/31/2013 01:24:14 AM | 10/31/2013 01:54:39 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10003 | 187 2 AVENUE | 2 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.730816 | -73.986073 | (40.73081644089586, -73.98607265739876) |
26594169 | 10/31/2013 01:20:57 AM | 10/31/2013 02:12:31 AM | NYPD | New York City Police Department | Illegal Parking | Double Parked Blocking Vehicle | Street/Sidewalk | 10029 | 65 EAST 99 STREET | EAST 99 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.788974 | -73.952259 | (40.78897400211689, -73.95225898702977) |
26594391 | 10/31/2013 01:20:13 AM | NaN | NYPD | New York City Police Department | Noise - Vehicle | Engine Idling | Street/Sidewalk | 10466 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.891517 | -73.836457 | (40.89151738488846, -73.83645714593568) |
26590917 | 10/31/2013 01:19:54 AM | NaN | DOHMH | Department of Health and Mental Hygiene | Rodent | Rat Sighting | 1-2 Family Mixed Use Building | 11219 | 63 STREET | 63 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.626477 | -73.999218 | (40.6264774690411, -73.99921826202639) |
26591458 | 10/31/2013 01:14:02 AM | 10/31/2013 01:30:34 AM | NYPD | New York City Police Department | Noise - House of Worship | Loud Music/Party | House of Worship | 10025 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.796597 | -73.970370 | (40.7965967075252, -73.97036973473399) |
26594086 | 10/31/2013 12:54:03 AM | 10/31/2013 02:16:39 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10310 | 173 CAMPBELL AVENUE | CAMPBELL AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.636182 | -74.116150 | (40.63618202176914, -74.1161500428337) |
26595117 | 10/31/2013 12:52:46 AM | NaN | NYPD | New York City Police Department | Illegal Parking | Posted Parking Sign Violation | Street/Sidewalk | 11236 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.632437 | -73.888173 | (40.63243692394328, -73.88817263437012) |
26590389 | 10/31/2013 12:51:00 AM | NaN | DOT | Department of Transportation | Street Light Condition | Street Light Out | NaN | NaN | 226 42 ST E | 42 ST E | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
26594210 | 10/31/2013 12:46:27 AM | NaN | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10033 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.852058 | -73.934776 | (40.85205827756883, -73.93477640780834) |
26592932 | 10/31/2013 12:43:47 AM | 10/31/2013 12:56:20 AM | NYPD | New York City Police Department | Noise - House of Worship | Loud Music/Party | House of Worship | 11216 | 778 PARK PLACE | PARK PLACE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.673505 | -73.951844 | (40.67350473678714, -73.95184414979961) |
26594152 | 10/31/2013 12:41:17 AM | 10/31/2013 01:04:37 AM | NYPD | New York City Police Department | Noise - Commercial | Banging/Pounding | Store/Commercial | 10016 | 155 E 34TH ST | E 34TH ST | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.746194 | -73.978769 | (40.74619417253121, -73.97876853124392) |
26589678 | 10/31/2013 12:39:55 AM | NaN | NYPD | New York City Police Department | Noise - Vehicle | Car/Truck Music | Street/Sidewalk | 11419 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.692394 | -73.833891 | (40.69239424979043, -73.8338912453996) |
26592304 | 10/31/2013 12:38:00 AM | NaN | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11216 | 371 TOMPKINS AVENUE | TOMPKINS AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.684944 | -73.944221 | (40.6849442562592, -73.94422078036632) |
26591892 | 10/31/2013 12:37:16 AM | NaN | NYPD | New York City Police Department | Blocked Driveway | Partial Access | Street/Sidewalk | 10305 | 1496 BAY STREET | BAY STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.607245 | -74.061106 | (40.60724493456944, -74.06110566015863) |
26591573 | 10/31/2013 12:35:18 AM | 10/31/2013 02:41:35 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10312 | 24 PRINCETON LANE | PRINCETON LANE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.553421 | -74.196743 | (40.55342078716953, -74.19674315017886) |
26590509 | 10/31/2013 12:33:00 AM | NaN | DOT | Department of Transportation | Street Light Condition | Street Light Out | NaN | NaN | 38 ST E | 38 ST E | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
26591379 | 10/31/2013 12:32:44 AM | NaN | DOHMH | Department of Health and Mental Hygiene | Harboring Bees/Wasps | Bees/Wasps - Not a beekeper | 3+ Family Mixed Use Building | 10025 | 501 WEST 110 STREET | WEST 110 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.803149 | -73.964266 | (40.80314938553783, -73.96426608076082) |
26594085 | 10/31/2013 12:32:08 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10026 | 121 WEST 116 STREET | WEST 116 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.802390 | -73.950526 | (40.80238950799943, -73.95052644123253) |
26589201 | 10/31/2013 12:32:00 AM | NaN | DOT | Department of Transportation | Street Light Condition | Street Light Out | NaN | 10309 | 295 BAYVIEW AVENUE | BAYVIEW AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.517378 | -74.203435 | (40.517377871705676, -74.20343466779575) |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
26428764 | 10/04/2013 12:17:03 AM | 10/04/2013 12:38:37 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10022 | 249 EAST 53 STREET | EAST 53 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.757248 | -73.968286 | (40.757247604963055, -73.96828647941395) |
26426166 | 10/04/2013 12:16:22 AM | 10/04/2013 05:50:49 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Store/Commercial | 10029 | 252 EAST 110 STREET | EAST 110 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.793635 | -73.941649 | (40.79363516179995, -73.94164859515777) |
26438565 | 10/04/2013 12:16:00 AM | NaN | DEP | Department of Environmental Protection | Noise | Noise: Construction Before/After Hours (NM1) | NaN | 11231 | 480 VAN BRUNT STREET | VAN BRUNT STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.674249 | -74.016558 | (40.6742492231181, -74.01655803856313) |
26428990 | 10/04/2013 12:15:52 AM | 10/04/2013 12:44:52 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10014 | 733 WASHINGTON STREET | WASHINGTON STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.736304 | -74.008299 | (40.736303747410034, -74.00829935904578) |
26432659 | 10/04/2013 12:15:46 AM | 10/04/2013 04:18:45 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11218 | 1213 CORTELYOU ROAD | CORTELYOU ROAD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.640139 | -73.966847 | (40.64013899178716, -73.96684680242933) |
26426096 | 10/04/2013 12:14:09 AM | 10/04/2013 01:03:46 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10032 | 539 WEST 162 STREET | WEST 162 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.836532 | -73.941018 | (40.836532082987944, -73.9410182995914) |
26437764 | 10/04/2013 12:14:00 AM | 10/04/2013 12:14:00 AM | DEP | Department of Environmental Protection | Water System | Dirty Water (WE) | NaN | 10022 | 251 EAST 51 STREET | EAST 51 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.755980 | -73.969171 | (40.75597977288785, -73.96917140638074) |
26436286 | 10/04/2013 12:14:00 AM | NaN | DEP | Department of Environmental Protection | Noise | Noise: Construction Before/After Hours (NM1) | NaN | 11231 | 480 VAN BRUNT STREET | VAN BRUNT STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.674249 | -74.016558 | (40.6742492231181, -74.01655803856313) |
26428989 | 10/04/2013 12:13:08 AM | 10/04/2013 02:12:47 AM | NYPD | New York City Police Department | Illegal Parking | Posted Parking Sign Violation | Street/Sidewalk | 11434 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.656160 | -73.767353 | (40.656160351546845, -73.76735262738222) |
26430030 | 10/04/2013 12:12:07 AM | 10/04/2013 02:45:24 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10027 | 215 WEST 131ST STREET | WEST 131ST STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.813339 | -73.946328 | (40.81333907832113, -73.94632769228208) |
26429663 | 10/04/2013 12:12:07 AM | 10/04/2013 01:03:44 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11209 | 8915 5 AVENUE | 5 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.619601 | -74.027826 | (40.619601147364584, -74.02782628492785) |
26437763 | 10/04/2013 12:11:00 AM | NaN | DEP | Department of Environmental Protection | Noise | Noise: Construction Before/After Hours (NM1) | NaN | 10028 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.774612 | -73.948085 | (40.77461174278734, -73.94808472177321) |
26432955 | 10/04/2013 12:08:15 AM | 10/04/2013 12:48:02 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10009 | 506 EAST 13 STREET | EAST 13 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.729531 | -73.980416 | (40.72953107218499, -73.98041550317102) |
26437035 | 10/04/2013 12:08:00 AM | 10/04/2013 12:13:00 AM | DEP | Department of Environmental Protection | Water System | Dirty Water (WE) | NaN | 10022 | 325 EAST 54 STREET | EAST 54 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.757069 | -73.965933 | (40.75706852462872, -73.96593314322774) |
26433197 | 10/04/2013 12:08:00 AM | 10/04/2013 12:00:00 PM | DSNY | BCC - Queens East | Derelict Vehicles | 14 Derelict Vehicles | Street | 11413 | 220-11 145 AVENUE | 145 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.664353 | -73.757556 | (40.6643525308449, -73.75755575708348) |
26426060 | 10/04/2013 12:06:39 AM | 10/04/2013 12:31:16 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11224 | 4823 BEACH 48 STREET | BEACH 48 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.577515 | -74.012207 | (40.57751513866536, -74.01220705163807) |
26430628 | 10/04/2013 12:06:28 AM | 10/04/2013 12:21:39 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Talking | Club/Bar/Restaurant | 11209 | 7915 3 AVENUE | 3 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.628381 | -74.029040 | (40.62838143294541, -74.02904041333245) |
26431648 | 10/04/2013 12:06:26 AM | 10/23/2013 08:14:52 AM | DOT | Department of Transportation | Street Sign - Missing | Bus Stop | Street | 11378 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.732250 | -73.924513 | (40.73225000573841, -73.92451289173367) |
26437034 | 10/04/2013 12:06:00 AM | NaN | DEP | Department of Environmental Protection | Noise | Noise: Jack Hammering (NC2) | NaN | 10036 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.760116 | -73.984836 | (40.76011648520396, -73.98483562074706) |
26426094 | 10/04/2013 12:05:12 AM | 10/04/2013 01:08:29 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11237 | 211 KNICKERBOCKER AVENUE | KNICKERBOCKER AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.703365 | -73.926345 | (40.703365329011284, -73.92634531031759) |
26429040 | 10/04/2013 12:04:52 AM | 10/04/2013 03:01:04 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10003 | 99 2 AVENUE | 2 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.727251 | -73.988660 | (40.727251287038676, -73.98866028603422) |
26434084 | 10/04/2013 12:04:00 AM | NaN | DEP | Department of Environmental Protection | Noise | Noise: Construction Before/After Hours (NM1) | NaN | 10036 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.760405 | -73.987474 | (40.76040500039782, -73.98747426235285) |
26426164 | 10/04/2013 12:03:00 AM | 10/04/2013 02:14:57 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 11106 | 30-09 BROADWAY | BROADWAY | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.762279 | -73.926013 | (40.762278541098084, -73.92601303458156) |
26439710 | 10/04/2013 12:03:00 AM | 10/04/2013 12:03:00 AM | DEP | Department of Environmental Protection | Water System | Dirty Water (WE) | NaN | 10022 | 325 EAST 54 STREET | EAST 54 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.757069 | -73.965933 | (40.75706852462872, -73.96593314322774) |
26435569 | 10/04/2013 12:02:00 AM | 10/04/2013 01:10:00 AM | DEP | Department of Environmental Protection | Water System | Dirty Water (WE) | NaN | 10022 | 311 EAST 50 STREET | EAST 50 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.754662 | -73.967992 | (40.75466197318078, -73.96799173864807) |
26426013 | 10/04/2013 12:01:13 AM | 10/07/2013 04:07:16 PM | DPR | Department of Parks and Recreation | Maintenance or Facility | Structure - Outdoors | Park | 11213 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
26428083 | 10/04/2013 12:01:05 AM | 10/04/2013 02:13:50 AM | NYPD | New York City Police Department | Illegal Parking | Posted Parking Sign Violation | Street/Sidewalk | 11434 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.656160 | -73.767353 | (40.656160351546845, -73.76735262738222) |
26428987 | 10/04/2013 12:00:45 AM | 10/04/2013 01:25:01 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10016 | 344 EAST 28 STREET | EAST 28 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.740295 | -73.976952 | (40.740295354643706, -73.97695165980414) |
26426115 | 10/04/2013 12:00:28 AM | 10/04/2013 04:17:32 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Talking | Club/Bar/Restaurant | 11226 | 1233 FLATBUSH AVENUE | FLATBUSH AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.640182 | -73.955306 | (40.64018174662485, -73.95530566958138) |
26428033 | 10/04/2013 12:00:10 AM | 10/04/2013 01:20:52 AM | NYPD | New York City Police Department | Blocked Driveway | Partial Access | Street/Sidewalk | 11236 | 1259 EAST 94 STREET | EAST 94 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.640024 | -73.900717 | (40.640024057399216, -73.90071711703163) |
111069 rows × 51 columns
A few basic operations on DataFrames:
complaints.shape
(111069, 51)
complaints.columns
Index(['Created Date', 'Closed Date', 'Agency', 'Agency Name', 'Complaint Type', 'Descriptor', 'Location Type', 'Incident Zip', 'Incident Address', 'Street Name', 'Cross Street 1', 'Cross Street 2', 'Intersection Street 1', 'Intersection Street 2', 'Address Type', 'City', 'Landmark', 'Facility Type', 'Status', 'Due Date', 'Resolution Action Updated Date', 'Community Board', 'Borough', 'X Coordinate (State Plane)', 'Y Coordinate (State Plane)', 'Park Facility Name', 'Park Borough', 'School Name', 'School Number', 'School Region', 'School Code', 'School Phone Number', 'School Address', 'School City', 'School State', 'School Zip', 'School Not Found', 'School or Citywide Complaint', 'Vehicle Type', 'Taxi Company Borough', 'Taxi Pick Up Location', 'Bridge Highway Name', 'Bridge Highway Direction', 'Road Ramp', 'Bridge Highway Segment', 'Garage Lot Name', 'Ferry Direction', 'Ferry Terminal Name', 'Latitude', 'Longitude', 'Location'], dtype='object')
complaints[:5]
Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | ... | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Unique Key | |||||||||||||||||||||
26589651 | 10/31/2013 02:08:41 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11432 | 90-03 169 STREET | 169 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.708275 | -73.791604 | (40.70827532593202, -73.79160395779721) |
26593698 | 10/31/2013 02:01:04 AM | NaN | NYPD | New York City Police Department | Illegal Parking | Commercial Overnight Parking | Street/Sidewalk | 11378 | 58 AVENUE | 58 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.721041 | -73.909453 | (40.721040535628305, -73.90945306791765) |
26594139 | 10/31/2013 02:00:24 AM | 10/31/2013 02:40:32 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10032 | 4060 BROADWAY | BROADWAY | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843330 | -73.939144 | (40.84332975466513, -73.93914371913482) |
26595721 | 10/31/2013 01:56:23 AM | 10/31/2013 02:21:48 AM | NYPD | New York City Police Department | Noise - Vehicle | Car/Truck Horn | Street/Sidewalk | 10023 | WEST 72 STREET | WEST 72 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.778009 | -73.980213 | (40.7780087446372, -73.98021349023975) |
26590930 | 10/31/2013 01:53:44 AM | NaN | DOHMH | Department of Health and Mental Hygiene | Rodent | Condition Attracting Rodents | Vacant Lot | 10027 | WEST 124 STREET | WEST 124 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.807691 | -73.947387 | (40.80769092704951, -73.94738703491433) |
5 rows × 51 columns
We can access an individual column using square brackets. This returns a Series
object, which represents a one-dimensional labelled array. In this case, the array is labelled by Unique Key
.
complaints['Complaint Type']
Unique Key 26589651 Noise - Street/Sidewalk 26593698 Illegal Parking 26594139 Noise - Commercial 26595721 Noise - Vehicle 26590930 Rodent 26592370 Noise - Commercial 26595682 Blocked Driveway 26595195 Noise - Commercial 26590540 Noise - Commercial 26594392 Noise - Commercial 26595176 Noise - House of Worship 26591982 Noise - Commercial 26594169 Illegal Parking 26594391 Noise - Vehicle 26590917 Rodent 26591458 Noise - House of Worship 26594086 Noise - Street/Sidewalk 26595117 Illegal Parking 26590389 Street Light Condition 26594210 Noise - Commercial 26592932 Noise - House of Worship 26594152 Noise - Commercial 26589678 Noise - Vehicle 26592304 Noise - Commercial 26591892 Blocked Driveway 26591573 Noise - Street/Sidewalk 26590509 Street Light Condition 26591379 Harboring Bees/Wasps 26594085 Noise - Street/Sidewalk 26589201 Street Light Condition ... 26428764 Noise - Commercial 26426166 Noise - Commercial 26438565 Noise 26428990 Noise - Street/Sidewalk 26432659 Noise - Commercial 26426096 Noise - Street/Sidewalk 26437764 Water System 26436286 Noise 26428989 Illegal Parking 26430030 Noise - Street/Sidewalk 26429663 Noise - Commercial 26437763 Noise 26432955 Noise - Commercial 26437035 Water System 26433197 Derelict Vehicles 26426060 Noise - Street/Sidewalk 26430628 Noise - Commercial 26431648 Street Sign - Missing 26437034 Noise 26426094 Noise - Commercial 26429040 Noise - Street/Sidewalk 26434084 Noise 26426164 Noise - Commercial 26439710 Water System 26435569 Water System 26426013 Maintenance or Facility 26428083 Illegal Parking 26428987 Noise - Street/Sidewalk 26426115 Noise - Commercial 26428033 Blocked Driveway Name: Complaint Type, dtype: object
However, if we select multiple columns, we get back a DataFrame:
complaints[['Complaint Type', 'Borough']]
Complaint Type | Borough | |
---|---|---|
Unique Key | ||
26589651 | Noise - Street/Sidewalk | QUEENS |
26593698 | Illegal Parking | QUEENS |
26594139 | Noise - Commercial | MANHATTAN |
26595721 | Noise - Vehicle | MANHATTAN |
26590930 | Rodent | MANHATTAN |
26592370 | Noise - Commercial | QUEENS |
26595682 | Blocked Driveway | QUEENS |
26595195 | Noise - Commercial | QUEENS |
26590540 | Noise - Commercial | MANHATTAN |
26594392 | Noise - Commercial | BROOKLYN |
26595176 | Noise - House of Worship | BROOKLYN |
26591982 | Noise - Commercial | MANHATTAN |
26594169 | Illegal Parking | MANHATTAN |
26594391 | Noise - Vehicle | BRONX |
26590917 | Rodent | BROOKLYN |
26591458 | Noise - House of Worship | MANHATTAN |
26594086 | Noise - Street/Sidewalk | STATEN ISLAND |
26595117 | Illegal Parking | BROOKLYN |
26590389 | Street Light Condition | BROOKLYN |
26594210 | Noise - Commercial | MANHATTAN |
26592932 | Noise - House of Worship | BROOKLYN |
26594152 | Noise - Commercial | MANHATTAN |
26589678 | Noise - Vehicle | QUEENS |
26592304 | Noise - Commercial | BROOKLYN |
26591892 | Blocked Driveway | STATEN ISLAND |
26591573 | Noise - Street/Sidewalk | STATEN ISLAND |
26590509 | Street Light Condition | BROOKLYN |
26591379 | Harboring Bees/Wasps | MANHATTAN |
26594085 | Noise - Street/Sidewalk | MANHATTAN |
26589201 | Street Light Condition | STATEN ISLAND |
... | ... | ... |
26428764 | Noise - Commercial | MANHATTAN |
26426166 | Noise - Commercial | MANHATTAN |
26438565 | Noise | BROOKLYN |
26428990 | Noise - Street/Sidewalk | MANHATTAN |
26432659 | Noise - Commercial | BROOKLYN |
26426096 | Noise - Street/Sidewalk | MANHATTAN |
26437764 | Water System | MANHATTAN |
26436286 | Noise | BROOKLYN |
26428989 | Illegal Parking | QUEENS |
26430030 | Noise - Street/Sidewalk | MANHATTAN |
26429663 | Noise - Commercial | BROOKLYN |
26437763 | Noise | MANHATTAN |
26432955 | Noise - Commercial | MANHATTAN |
26437035 | Water System | MANHATTAN |
26433197 | Derelict Vehicles | QUEENS |
26426060 | Noise - Street/Sidewalk | BROOKLYN |
26430628 | Noise - Commercial | BROOKLYN |
26431648 | Street Sign - Missing | QUEENS |
26437034 | Noise | MANHATTAN |
26426094 | Noise - Commercial | BROOKLYN |
26429040 | Noise - Street/Sidewalk | MANHATTAN |
26434084 | Noise | MANHATTAN |
26426164 | Noise - Commercial | QUEENS |
26439710 | Water System | MANHATTAN |
26435569 | Water System | MANHATTAN |
26426013 | Maintenance or Facility | BROOKLYN |
26428083 | Illegal Parking | QUEENS |
26428987 | Noise - Street/Sidewalk | MANHATTAN |
26426115 | Noise - Commercial | BROOKLYN |
26428033 | Blocked Driveway | BROOKLYN |
111069 rows × 2 columns
We can access an individual row using its label:
complaints.loc[26589651]
Created Date 10/31/2013 02:08:41 AM Closed Date NaN Agency NYPD Agency Name New York City Police Department Complaint Type Noise - Street/Sidewalk Descriptor Loud Talking Location Type Street/Sidewalk Incident Zip 11432 Incident Address 90-03 169 STREET Street Name 169 STREET Cross Street 1 90 AVENUE Cross Street 2 91 AVENUE Intersection Street 1 NaN Intersection Street 2 NaN Address Type ADDRESS City JAMAICA Landmark NaN Facility Type Precinct Status Assigned Due Date 10/31/2013 10:08:41 AM Resolution Action Updated Date 10/31/2013 02:35:17 AM Community Board 12 QUEENS Borough QUEENS X Coordinate (State Plane) 1.04203e+06 Y Coordinate (State Plane) 197389 Park Facility Name Unspecified Park Borough QUEENS School Name Unspecified School Number Unspecified School Region Unspecified School Code Unspecified School Phone Number Unspecified School Address Unspecified School City Unspecified School State Unspecified School Zip Unspecified School Not Found N School or Citywide Complaint NaN Vehicle Type NaN Taxi Company Borough NaN Taxi Pick Up Location NaN Bridge Highway Name NaN Bridge Highway Direction NaN Road Ramp NaN Bridge Highway Segment NaN Garage Lot Name NaN Ferry Direction NaN Ferry Terminal Name NaN Latitude 40.7083 Longitude -73.7916 Location (40.70827532593202, -73.79160395779721) Name: 26589651, dtype: object
Or by its row number:
complaints.iloc[0]
Created Date 10/31/2013 02:08:41 AM Closed Date NaN Agency NYPD Agency Name New York City Police Department Complaint Type Noise - Street/Sidewalk Descriptor Loud Talking Location Type Street/Sidewalk Incident Zip 11432 Incident Address 90-03 169 STREET Street Name 169 STREET Cross Street 1 90 AVENUE Cross Street 2 91 AVENUE Intersection Street 1 NaN Intersection Street 2 NaN Address Type ADDRESS City JAMAICA Landmark NaN Facility Type Precinct Status Assigned Due Date 10/31/2013 10:08:41 AM Resolution Action Updated Date 10/31/2013 02:35:17 AM Community Board 12 QUEENS Borough QUEENS X Coordinate (State Plane) 1.04203e+06 Y Coordinate (State Plane) 197389 Park Facility Name Unspecified Park Borough QUEENS School Name Unspecified School Number Unspecified School Region Unspecified School Code Unspecified School Phone Number Unspecified School Address Unspecified School City Unspecified School State Unspecified School Zip Unspecified School Not Found N School or Citywide Complaint NaN Vehicle Type NaN Taxi Company Borough NaN Taxi Pick Up Location NaN Bridge Highway Name NaN Bridge Highway Direction NaN Road Ramp NaN Bridge Highway Segment NaN Garage Lot Name NaN Ferry Direction NaN Ferry Terminal Name NaN Latitude 40.7083 Longitude -73.7916 Location (40.70827532593202, -73.79160395779721) Name: 26589651, dtype: object
Quick aside: if we don't specify an index column, then row number becomes the index column.
complaints_no_index = pd.read_csv('311-service-requests.csv')
complaints_no_index[:5]
Unique Key | Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | ... | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 26589651 | 10/31/2013 02:08:41 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11432 | 90-03 169 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.708275 | -73.791604 | (40.70827532593202, -73.79160395779721) |
1 | 26593698 | 10/31/2013 02:01:04 AM | NaN | NYPD | New York City Police Department | Illegal Parking | Commercial Overnight Parking | Street/Sidewalk | 11378 | 58 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.721041 | -73.909453 | (40.721040535628305, -73.90945306791765) |
2 | 26594139 | 10/31/2013 02:00:24 AM | 10/31/2013 02:40:32 AM | NYPD | New York City Police Department | Noise - Commercial | Loud Music/Party | Club/Bar/Restaurant | 10032 | 4060 BROADWAY | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843330 | -73.939144 | (40.84332975466513, -73.93914371913482) |
3 | 26595721 | 10/31/2013 01:56:23 AM | 10/31/2013 02:21:48 AM | NYPD | New York City Police Department | Noise - Vehicle | Car/Truck Horn | Street/Sidewalk | 10023 | WEST 72 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.778009 | -73.980213 | (40.7780087446372, -73.98021349023975) |
4 | 26590930 | 10/31/2013 01:53:44 AM | NaN | DOHMH | Department of Health and Mental Hygiene | Rodent | Condition Attracting Rodents | Vacant Lot | 10027 | WEST 124 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.807691 | -73.947387 | (40.80769092704951, -73.94738703491433) |
5 rows × 52 columns
complaints_no_index[['Complaint Type', 'Borough']]
Complaint Type | Borough | |
---|---|---|
0 | Noise - Street/Sidewalk | QUEENS |
1 | Illegal Parking | QUEENS |
2 | Noise - Commercial | MANHATTAN |
3 | Noise - Vehicle | MANHATTAN |
4 | Rodent | MANHATTAN |
5 | Noise - Commercial | QUEENS |
6 | Blocked Driveway | QUEENS |
7 | Noise - Commercial | QUEENS |
8 | Noise - Commercial | MANHATTAN |
9 | Noise - Commercial | BROOKLYN |
10 | Noise - House of Worship | BROOKLYN |
11 | Noise - Commercial | MANHATTAN |
12 | Illegal Parking | MANHATTAN |
13 | Noise - Vehicle | BRONX |
14 | Rodent | BROOKLYN |
15 | Noise - House of Worship | MANHATTAN |
16 | Noise - Street/Sidewalk | STATEN ISLAND |
17 | Illegal Parking | BROOKLYN |
18 | Street Light Condition | BROOKLYN |
19 | Noise - Commercial | MANHATTAN |
20 | Noise - House of Worship | BROOKLYN |
21 | Noise - Commercial | MANHATTAN |
22 | Noise - Vehicle | QUEENS |
23 | Noise - Commercial | BROOKLYN |
24 | Blocked Driveway | STATEN ISLAND |
25 | Noise - Street/Sidewalk | STATEN ISLAND |
26 | Street Light Condition | BROOKLYN |
27 | Harboring Bees/Wasps | MANHATTAN |
28 | Noise - Street/Sidewalk | MANHATTAN |
29 | Street Light Condition | STATEN ISLAND |
... | ... | ... |
111039 | Noise - Commercial | MANHATTAN |
111040 | Noise - Commercial | MANHATTAN |
111041 | Noise | BROOKLYN |
111042 | Noise - Street/Sidewalk | MANHATTAN |
111043 | Noise - Commercial | BROOKLYN |
111044 | Noise - Street/Sidewalk | MANHATTAN |
111045 | Water System | MANHATTAN |
111046 | Noise | BROOKLYN |
111047 | Illegal Parking | QUEENS |
111048 | Noise - Street/Sidewalk | MANHATTAN |
111049 | Noise - Commercial | BROOKLYN |
111050 | Noise | MANHATTAN |
111051 | Noise - Commercial | MANHATTAN |
111052 | Water System | MANHATTAN |
111053 | Derelict Vehicles | QUEENS |
111054 | Noise - Street/Sidewalk | BROOKLYN |
111055 | Noise - Commercial | BROOKLYN |
111056 | Street Sign - Missing | QUEENS |
111057 | Noise | MANHATTAN |
111058 | Noise - Commercial | BROOKLYN |
111059 | Noise - Street/Sidewalk | MANHATTAN |
111060 | Noise | MANHATTAN |
111061 | Noise - Commercial | QUEENS |
111062 | Water System | MANHATTAN |
111063 | Water System | MANHATTAN |
111064 | Maintenance or Facility | BROOKLYN |
111065 | Illegal Parking | QUEENS |
111066 | Noise - Street/Sidewalk | MANHATTAN |
111067 | Noise - Commercial | BROOKLYN |
111068 | Blocked Driveway | BROOKLYN |
111069 rows × 2 columns
complaints_no_index.loc[0]
Unique Key 26589651 Created Date 10/31/2013 02:08:41 AM Closed Date NaN Agency NYPD Agency Name New York City Police Department Complaint Type Noise - Street/Sidewalk Descriptor Loud Talking Location Type Street/Sidewalk Incident Zip 11432 Incident Address 90-03 169 STREET Street Name 169 STREET Cross Street 1 90 AVENUE Cross Street 2 91 AVENUE Intersection Street 1 NaN Intersection Street 2 NaN Address Type ADDRESS City JAMAICA Landmark NaN Facility Type Precinct Status Assigned Due Date 10/31/2013 10:08:41 AM Resolution Action Updated Date 10/31/2013 02:35:17 AM Community Board 12 QUEENS Borough QUEENS X Coordinate (State Plane) 1.04203e+06 Y Coordinate (State Plane) 197389 Park Facility Name Unspecified Park Borough QUEENS School Name Unspecified School Number Unspecified School Region Unspecified School Code Unspecified School Phone Number Unspecified School Address Unspecified School City Unspecified School State Unspecified School Zip Unspecified School Not Found N School or Citywide Complaint NaN Vehicle Type NaN Taxi Company Borough NaN Taxi Pick Up Location NaN Bridge Highway Name NaN Bridge Highway Direction NaN Road Ramp NaN Bridge Highway Segment NaN Garage Lot Name NaN Ferry Direction NaN Ferry Terminal Name NaN Latitude 40.7083 Longitude -73.7916 Location (40.70827532593202, -73.79160395779721) Name: 0, dtype: object
--> Basic operations on DataFrames: http://pandas.pydata.org/pandas-docs/stable/basics.html
complaints['Complaint Type']
Unique Key 26589651 Noise - Street/Sidewalk 26593698 Illegal Parking 26594139 Noise - Commercial 26595721 Noise - Vehicle 26590930 Rodent 26592370 Noise - Commercial 26595682 Blocked Driveway 26595195 Noise - Commercial 26590540 Noise - Commercial 26594392 Noise - Commercial 26595176 Noise - House of Worship 26591982 Noise - Commercial 26594169 Illegal Parking 26594391 Noise - Vehicle 26590917 Rodent 26591458 Noise - House of Worship 26594086 Noise - Street/Sidewalk 26595117 Illegal Parking 26590389 Street Light Condition 26594210 Noise - Commercial 26592932 Noise - House of Worship 26594152 Noise - Commercial 26589678 Noise - Vehicle 26592304 Noise - Commercial 26591892 Blocked Driveway 26591573 Noise - Street/Sidewalk 26590509 Street Light Condition 26591379 Harboring Bees/Wasps 26594085 Noise - Street/Sidewalk 26589201 Street Light Condition ... 26428764 Noise - Commercial 26426166 Noise - Commercial 26438565 Noise 26428990 Noise - Street/Sidewalk 26432659 Noise - Commercial 26426096 Noise - Street/Sidewalk 26437764 Water System 26436286 Noise 26428989 Illegal Parking 26430030 Noise - Street/Sidewalk 26429663 Noise - Commercial 26437763 Noise 26432955 Noise - Commercial 26437035 Water System 26433197 Derelict Vehicles 26426060 Noise - Street/Sidewalk 26430628 Noise - Commercial 26431648 Street Sign - Missing 26437034 Noise 26426094 Noise - Commercial 26429040 Noise - Street/Sidewalk 26434084 Noise 26426164 Noise - Commercial 26439710 Water System 26435569 Water System 26426013 Maintenance or Facility 26428083 Illegal Parking 26428987 Noise - Street/Sidewalk 26426115 Noise - Commercial 26428033 Blocked Driveway Name: Complaint Type, dtype: object
The value_counts
method will tally up the number of times a value appears in a column, and will return a Series
with the counts, in descending order.
ctcounts = complaints['Complaint Type'].value_counts()
ctcounts
HEATING 14200 GENERAL CONSTRUCTION 7471 Street Light Condition 7117 DOF Literature Request 5797 PLUMBING 5373 PAINT - PLASTER 5149 Blocked Driveway 4590 NONCONST 3998 Street Condition 3473 Illegal Parking 3343 Noise 3321 Traffic Signal Condition 3145 Dirty Conditions 2653 Water System 2636 Noise - Commercial 2578 ELECTRIC 2350 Broken Muni Meter 2070 Noise - Street/Sidewalk 1928 Sanitation Condition 1824 Rodent 1632 Sewer 1627 Consumer Complaint 1227 Taxi Complaint 1227 Damaged Tree 1180 Overgrown Tree/Branches 1083 Graffiti 973 Missed Collection (All Materials) 973 Building/Use 942 Root/Sewer/Sidewalk Condition 836 Derelict Vehicle 803 ... Miscellaneous Categories 5 Posting Advertisement 5 Internal Code 5 Poison Ivy 5 Fire Alarm - Modification 5 Special Natural Area District (SNAD) 4 Transportation Provider Complaint 4 Illegal Animal Sold 4 Ferry Complaint 4 Invitation 3 Fire Alarm - Replacement 3 Illegal Fireworks 3 Adopt-A-Basket 3 Window Guard 2 Misc. Comments 2 Opinion for the Mayor 2 Public Assembly 2 Legal Services Provider Complaint 2 DFTA Literature Request 2 Highway Sign - Damaged 1 Open Flame Permit 1 Stalled Sites 1 Ferry Permit 1 Trans Fat 1 Municipal Parking Facility 1 DWD 1 DHS Income Savings Requirement 1 Snow 1 X-Ray Machine/Equipment 1 Tunnel Condition 1 Name: Complaint Type, dtype: int64
ctcounts[:10]
HEATING 14200 GENERAL CONSTRUCTION 7471 Street Light Condition 7117 DOF Literature Request 5797 PLUMBING 5373 PAINT - PLASTER 5149 Blocked Driveway 4590 NONCONST 3998 Street Condition 3473 Illegal Parking 3343 Name: Complaint Type, dtype: int64
--> Visualization functions: http://pandas.pydata.org/pandas-docs/stable/visualization.html
%matplotlib inline
import matplotlib.pyplot as plt
ctcounts[:10].plot("bar")
<matplotlib.axes._subplots.AxesSubplot at 0x7f2968113b70>
complaints['Latitude'].plot('hist', bins=50),
(<matplotlib.axes._subplots.AxesSubplot at 0x7f29636c6908>,)
complaints.plot(kind='scatter', x='Longitude', y='Latitude')
<matplotlib.axes._subplots.AxesSubplot at 0x7f2963033e80>
--> Indexing in DataFrames: http://pandas.pydata.org/pandas-docs/stable/indexing.html
This is similar to fancy indexing in NumPy. We can apply operations to columns, including comparison operators:
is_noise = complaints['Complaint Type'] == "Noise - Street/Sidewalk"
is_noise
Unique Key 26589651 True 26593698 False 26594139 False 26595721 False 26590930 False 26592370 False 26595682 False 26595195 False 26590540 False 26594392 False 26595176 False 26591982 False 26594169 False 26594391 False 26590917 False 26591458 False 26594086 True 26595117 False 26590389 False 26594210 False 26592932 False 26594152 False 26589678 False 26592304 False 26591892 False 26591573 True 26590509 False 26591379 False 26594085 True 26589201 False ... 26428764 False 26426166 False 26438565 False 26428990 True 26432659 False 26426096 True 26437764 False 26436286 False 26428989 False 26430030 True 26429663 False 26437763 False 26432955 False 26437035 False 26433197 False 26426060 True 26430628 False 26431648 False 26437034 False 26426094 False 26429040 True 26434084 False 26426164 False 26439710 False 26435569 False 26426013 False 26428083 False 26428987 True 26426115 False 26428033 False Name: Complaint Type, dtype: bool
This just returns a series of True/False values, but we can plug this into the square brackets to only get the rows that meet that condition:
noise_complaints = complaints[is_noise]
noise_complaints
Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | ... | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Unique Key | |||||||||||||||||||||
26589651 | 10/31/2013 02:08:41 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11432 | 90-03 169 STREET | 169 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.708275 | -73.791604 | (40.70827532593202, -73.79160395779721) |
26594086 | 10/31/2013 12:54:03 AM | 10/31/2013 02:16:39 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10310 | 173 CAMPBELL AVENUE | CAMPBELL AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.636182 | -74.116150 | (40.63618202176914, -74.1161500428337) |
26591573 | 10/31/2013 12:35:18 AM | 10/31/2013 02:41:35 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10312 | 24 PRINCETON LANE | PRINCETON LANE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.553421 | -74.196743 | (40.55342078716953, -74.19674315017886) |
26594085 | 10/31/2013 12:32:08 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10026 | 121 WEST 116 STREET | WEST 116 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.802390 | -73.950526 | (40.80238950799943, -73.95052644123253) |
26595564 | 10/31/2013 12:30:36 AM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11236 | AVENUE J | AVENUE J | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.634104 | -73.911055 | (40.634103775951736, -73.91105541883589) |
26590446 | 10/31/2013 12:11:58 AM | 10/31/2013 01:54:38 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10459 | 819 EAST 167 STREET | EAST 167 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.826623 | -73.899965 | (40.826622810177874, -73.8999653556452) |
26595553 | 10/31/2013 12:05:10 AM | 10/31/2013 02:43:43 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11225 | 25 LEFFERTS AVENUE | LEFFERTS AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.661793 | -73.959934 | (40.6617931276793, -73.95993363978067) |
26590695 | 10/30/2013 11:40:52 PM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10027 | 449 WEST 125 STREET | WEST 125 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.812717 | -73.955678 | (40.81271725522785, -73.95567750031833) |
26594653 | 10/30/2013 11:26:32 PM | 10/31/2013 12:18:54 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11222 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.724600 | -73.954271 | (40.724599563793525, -73.95427134534344) |
26594760 | 10/30/2013 11:17:43 PM | 10/31/2013 01:05:43 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10005 | 67 WALL STREET | WALL STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.705749 | -74.008342 | (40.70574909022401, -74.00834244292587) |
26593220 | 10/30/2013 11:14:15 PM | 10/31/2013 01:28:20 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10012 | 2 SPRING STREET | SPRING STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.721005 | -73.993990 | (40.721004620274186, -73.99398975732854) |
26590217 | 10/30/2013 11:11:28 PM | 10/31/2013 01:29:28 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10003 | 24 EAST 11 STREET | EAST 11 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.733587 | -73.994014 | (40.73358658490167, -73.99401388034954) |
26593483 | 10/30/2013 10:56:50 PM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11102 | 4-10 27 AVENUE | 27 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.774670 | -73.933296 | (40.77467000453749, -73.93329608172101) |
26593680 | 10/30/2013 10:43:14 PM | 10/30/2013 11:35:07 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11414 | 160-37 78 STREET | 78 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.654524 | -73.853361 | (40.654523538423916, -73.85336063975151) |
26593530 | 10/30/2013 10:42:13 PM | 10/30/2013 11:58:00 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10040 | 570 WEST 193 STREET | WEST 193 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.856599 | -73.927781 | (40.85659933835734, -73.92778065178295) |
26591036 | 10/30/2013 10:33:59 PM | 10/31/2013 02:35:20 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10472 | 1207 BEACH AVENUE | BEACH AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.830362 | -73.866022 | (40.83036235589997, -73.86602154214397) |
26594799 | 10/30/2013 10:31:49 PM | 10/31/2013 02:58:14 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10003 | 17 IRVING PLACE | IRVING PLACE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.734915 | -73.988450 | (40.734914622073454, -73.98844969804303) |
26591076 | 10/30/2013 10:31:11 PM | 10/31/2013 12:40:39 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | 1565 TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843426 | -73.914263 | (40.84342647770325, -73.91426327900784) |
26589308 | 10/30/2013 10:24:07 PM | 10/30/2013 11:31:52 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10031 | 529 WEST 151 STREET | WEST 151 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.829434 | -73.945791 | (40.82943385321531, -73.94579075649952) |
26592708 | 10/30/2013 10:18:13 PM | NaN | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11365 | 67-22 PARSONS BOULEVARD | PARSONS BOULEVARD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.733500 | -73.810710 | (40.73349978477908, -73.81071026404766) |
26591992 | 10/30/2013 10:02:58 PM | 10/30/2013 10:23:20 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11218 | DITMAS AVENUE | DITMAS AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.636169 | -73.972455 | (40.63616876563881, -73.97245504682485) |
26589309 | 10/30/2013 09:48:55 PM | 10/30/2013 10:34:58 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843986 | -73.913851 | (40.84398609069672, -73.9138505321433) |
26595576 | 10/30/2013 08:48:47 PM | 10/30/2013 08:54:39 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | 1565 TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843567 | -73.914386 | (40.843566548976696, -73.91438598253538) |
26594339 | 10/30/2013 08:40:17 PM | 10/30/2013 09:07:04 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10030 | 222 WEST 134 STREET | WEST 134 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.815016 | -73.945123 | (40.81501553603748, -73.94512331602647) |
26594167 | 10/30/2013 08:38:25 PM | 10/30/2013 10:26:28 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11218 | 126 BEVERLY ROAD | BEVERLY ROAD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.642922 | -73.978762 | (40.6429222774404, -73.97876175474585) |
26590110 | 10/30/2013 08:32:13 PM | 10/30/2013 11:58:00 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11210 | 2640 NOSTRAND AVENUE | NOSTRAND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.620390 | -73.945676 | (40.620390414106915, -73.94567596310624) |
26590005 | 10/30/2013 08:02:48 PM | 10/30/2013 08:28:48 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | 1565 TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843426 | -73.914263 | (40.84342647770325, -73.91426327900784) |
26594205 | 10/30/2013 07:44:49 PM | 10/30/2013 10:28:33 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | 1565 TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843567 | -73.914386 | (40.843566548976696, -73.91438598253538) |
26590556 | 10/30/2013 06:59:49 PM | 10/30/2013 09:30:52 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10022 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.762179 | -73.966129 | (40.762179295833846, -73.96612906924031) |
26593218 | 10/30/2013 06:24:42 PM | 10/30/2013 07:31:54 PM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10452 | 1565 TOWNSEND AVENUE | TOWNSEND AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.843567 | -73.914386 | (40.843566548976696, -73.91438598253538) |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
26434521 | 10/04/2013 03:46:11 AM | 10/04/2013 05:26:12 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10012 | 3 BLEEKER STREET | BLEEKER STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.725311 | -73.992478 | (40.72531105665101, -73.99247769162362) |
26433417 | 10/04/2013 03:25:00 AM | 10/04/2013 07:13:56 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10033 | 622 WEST 182 STREET | WEST 182 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.850150 | -73.933932 | (40.850150227792625, -73.9339324637959) |
26432543 | 10/04/2013 02:31:55 AM | 10/04/2013 06:38:00 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11238 | 400 LINCOLN PLACE | LINCOLN PLACE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.672872 | -73.963294 | (40.67287217154439, -73.96329404057829) |
26427965 | 10/04/2013 02:29:50 AM | 10/04/2013 06:37:42 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11238 | 840 WASHINGTON AVE | WASHINGTON AVE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.672935 | -73.963128 | (40.672935248350534, -73.96312817637728) |
26428058 | 10/04/2013 02:27:21 AM | 10/04/2013 06:37:04 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11238 | 407 LINCOLN PLACE | LINCOLN PLACE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.672831 | -73.963024 | (40.67283091333569, -73.96302368954532) |
26426116 | 10/04/2013 02:27:09 AM | 10/04/2013 04:03:08 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10458 | 2311 CROTONA AVENUE | CROTONA AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.852362 | -73.884423 | (40.85236190814451, -73.88442312851471) |
26430616 | 10/04/2013 02:23:35 AM | 10/04/2013 04:24:51 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10026 | 121 WEST 116 STREET | WEST 116 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.802390 | -73.950526 | (40.80238950799943, -73.95052644123253) |
26426026 | 10/04/2013 02:15:12 AM | 10/04/2013 06:36:54 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11238 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.672600 | -73.962753 | (40.672600264989946, -73.96275344461327) |
26428931 | 10/04/2013 02:10:04 AM | 10/04/2013 03:02:08 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10016 | EAST 28 STREET | EAST 28 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.742308 | -73.981660 | (40.74230810331397, -73.98166037217192) |
26427966 | 10/04/2013 01:47:46 AM | 10/04/2013 04:42:19 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11230 | 630 RUGBY ROAD | RUGBY ROAD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.634234 | -73.964065 | (40.63423411912724, -73.96406469568188) |
26428936 | 10/04/2013 01:35:41 AM | 10/04/2013 06:48:33 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10027 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.815527 | -73.959718 | (40.81552658739026, -73.95971825500413) |
26428984 | 10/04/2013 01:30:27 AM | 10/04/2013 02:49:38 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11412 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.707013 | -73.765790 | (40.70701250870174, -73.76579012401083) |
26429664 | 10/04/2013 01:18:37 AM | 10/04/2013 03:24:57 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 10459 | 973 UNION AVENUE | UNION AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.823432 | -73.901585 | (40.82343209751158, -73.90158522761199) |
26431596 | 10/04/2013 01:16:01 AM | 10/04/2013 02:02:20 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11692 | 55-05 BEACH CHANNEL DRIVE | BEACH CHANNEL DRIVE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.594979 | -73.785833 | (40.594979009156205, -73.7858326072286) |
26425971 | 10/04/2013 01:13:24 AM | 10/04/2013 06:36:46 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10467 | 375 EAST MOSHOLU PARKWAY NORTH | EAST MOSHOLU PARKWAY NORTH | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.870662 | -73.880049 | (40.870661854738415, -73.8800488518787) |
26431562 | 10/04/2013 01:02:45 AM | 10/04/2013 04:43:24 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11230 | 630 RUGBY ROAD | RUGBY ROAD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.634234 | -73.964065 | (40.63423411912724, -73.96406469568188) |
26431994 | 10/04/2013 12:59:57 AM | 10/04/2013 04:43:24 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11230 | 630 RUGBY ROAD | RUGBY ROAD | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.634234 | -73.964065 | (40.63423411912724, -73.96406469568188) |
26431930 | 10/04/2013 12:55:46 AM | 10/04/2013 07:12:52 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10040 | 108 ELLWOOD STREET | ELLWOOD STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.861829 | -73.929456 | (40.8618290295582, -73.9294560541168) |
26431874 | 10/04/2013 12:55:42 AM | 10/04/2013 02:44:20 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10027 | 213 W 131ST ST | W 131ST ST | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.813284 | -73.946252 | (40.81328414856855, -73.94625187216151) |
26430877 | 10/04/2013 12:52:45 AM | 10/04/2013 07:07:42 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10456 | 1314 GRAND CONCOURSE | GRAND CONCOURSE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.836664 | -73.915584 | (40.836664485984976, -73.91558382507789) |
26432669 | 10/04/2013 12:42:45 AM | 10/04/2013 06:48:20 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11249 | 125 TAYLOR STREET | TAYLOR STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.705733 | -73.963381 | (40.70573258854396, -73.96338054284558) |
26430709 | 10/04/2013 12:23:47 AM | 10/04/2013 02:45:23 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10030 | 145 WEST 138 STREET | WEST 138 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.816597 | -73.940783 | (40.81659709236496, -73.94078307354462) |
26427751 | 10/04/2013 12:22:50 AM | 10/04/2013 01:03:44 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10032 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.833488 | -73.941578 | (40.83348848222364, -73.94157750579586) |
26431005 | 10/04/2013 12:21:04 AM | 10/04/2013 01:50:26 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Music/Party | Street/Sidewalk | 11205 | 382 WILLOUGHBY AVENUE | WILLOUGHBY AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.692956 | -73.955350 | (40.692955608285715, -73.95534962720475) |
26428990 | 10/04/2013 12:15:52 AM | 10/04/2013 12:44:52 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10014 | 733 WASHINGTON STREET | WASHINGTON STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.736304 | -74.008299 | (40.736303747410034, -74.00829935904578) |
26426096 | 10/04/2013 12:14:09 AM | 10/04/2013 01:03:46 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10032 | 539 WEST 162 STREET | WEST 162 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.836532 | -73.941018 | (40.836532082987944, -73.9410182995914) |
26430030 | 10/04/2013 12:12:07 AM | 10/04/2013 02:45:24 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10027 | 215 WEST 131ST STREET | WEST 131ST STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.813339 | -73.946328 | (40.81333907832113, -73.94632769228208) |
26426060 | 10/04/2013 12:06:39 AM | 10/04/2013 12:31:16 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 11224 | 4823 BEACH 48 STREET | BEACH 48 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.577515 | -74.012207 | (40.57751513866536, -74.01220705163807) |
26429040 | 10/04/2013 12:04:52 AM | 10/04/2013 03:01:04 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10003 | 99 2 AVENUE | 2 AVENUE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.727251 | -73.988660 | (40.727251287038676, -73.98866028603422) |
26428987 | 10/04/2013 12:00:45 AM | 10/04/2013 01:25:01 AM | NYPD | New York City Police Department | Noise - Street/Sidewalk | Loud Talking | Street/Sidewalk | 10016 | 344 EAST 28 STREET | EAST 28 STREET | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 40.740295 | -73.976952 | (40.740295354643706, -73.97695165980414) |
1928 rows × 51 columns
noise_complaints.shape
(1928, 51)
We can combine multiple conditions with & and |
is_brooklyn = complaints['Borough'] == "BROOKLYN"
noise_complaints = complaints[is_noise & is_brooklyn]
noise_complaints.shape
(456, 51)
--> Working with text data: http://pandas.pydata.org/pandas-docs/stable/text.html
Now, it turns out that, in this dataset, the "Complaint Type" column can contain many different types of noise complaints:
complaints['Complaint Type'].unique()
array(['Noise - Street/Sidewalk', 'Illegal Parking', 'Noise - Commercial', 'Noise - Vehicle', 'Rodent', 'Blocked Driveway', 'Noise - House of Worship', 'Street Light Condition', 'Harboring Bees/Wasps', 'Taxi Complaint', 'Homeless Encampment', 'Traffic Signal Condition', 'Food Establishment', 'Noise - Park', 'Broken Muni Meter', 'Benefit Card Replacement', 'Sanitation Condition', 'ELECTRIC', 'PLUMBING', 'HEATING', 'GENERAL CONSTRUCTION', 'Street Condition', 'Consumer Complaint', 'Derelict Vehicles', 'Noise', 'Drinking', 'Indoor Air Quality', 'Panhandling', 'Derelict Vehicle', 'Lead', 'Water System', 'Noise - Helicopter', 'Homeless Person Assistance', 'Root/Sewer/Sidewalk Condition', 'Sidewalk Condition', 'Graffiti', 'DOF Literature Request', 'Animal in a Park', 'Overgrown Tree/Branches', 'Air Quality', 'Dirty Conditions', 'Water Quality', 'Other Enforcement', 'Collection Truck Noise', 'Illegal Tree Damage', 'Litter Basket / Request', 'Violation of Park Rules', 'Maintenance or Facility', 'Sewer', 'Agency Issues', 'Beach/Pool/Sauna Complaint', 'Bike/Roller/Skate Chronic', 'School Maintenance', 'Animal Abuse', 'Water Conservation', 'Damaged Tree', 'Street Sign - Damaged', 'Taxi Compliment', 'Overflowing Litter Baskets', 'Asbestos', 'Indoor Sewage', 'Hazardous Materials', 'Dead Tree', 'Smoking', 'Missed Collection (All Materials)', 'Standing Water', 'Sweeping/Missed-Inadequate', 'DCA / DOH New License Application Request', 'Ferry Inquiry', 'For Hire Vehicle Complaint', 'Traffic', 'Plumbing', 'Building/Use', 'SCRIE', 'Boilers', 'General Construction/Plumbing', 'Construction', 'Special Projects Inspection Team (SPIT)', 'Elevator', 'DOT Literature Request', 'Fire Safety Director - F58', 'Highway Condition', 'Street Sign - Missing', 'Investigations and Discipline (IAD)', 'Non-Residential Heat', 'Electrical', 'DOF Parking - Tax Exemption', 'Special Enforcement', 'Street Sign - Dangling', 'Internal Code', 'Unsanitary Animal Pvt Property', 'Derelict Bicycle', 'Industrial Waste', 'Vacant Lot', 'Food Poisoning', 'Found Property', 'Bridge Condition', 'Bike Rack Condition', 'BEST/Site Safety', 'EAP Inspection - F59', 'Vending', 'Public Assembly', 'Public Payphone Complaint', 'Unleashed Dog', 'Curb Condition', 'Cranes and Derricks', 'DCA Literature Request', 'City Vehicle Placard Complaint', 'Legal Services Provider Complaint', 'Posting Advertisement', 'PAINT - PLASTER', 'NONCONST', 'APPLIANCE', 'CONSTRUCTION', 'Recycling Enforcement', 'Special Natural Area District (SNAD)', 'DPR Internal', 'Opinion for the Mayor', 'Mold', 'Snow', 'Complaint', 'Urinating in Public', 'Senior Center Complaint', 'Animal Facility - No Permit', 'Unsanitary Animal Facility', 'Parking Card', 'Fire Alarm - Reinspection', 'Request for Information', 'Miscellaneous Categories', 'Unsanitary Pigeon Condition', 'Plant', 'Fire Alarm - Replacement', 'DPR Literature Request', 'Invitation', 'Scaffold Safety', 'Transportation Provider Complaint', 'Ferry Complaint', 'STRUCTURAL', 'Bus Stop Shelter Placement', 'Compliment', 'Misc. Comments', 'Disorderly Youth', 'OEM Literature Request', 'Fire Alarm - New System', 'Fire Alarm - Modification', 'Stalled Sites', 'Public Toilet', 'Tattooing', 'Illegal Animal Sold', 'Illegal Animal Kept as Pet', 'DFTA Literature Request', 'Adopt-A-Basket', 'DHS Income Savings Requirement', 'Poison Ivy', 'Tunnel Condition', 'Drinking Water', 'Municipal Parking Facility', 'Trans Fat', 'Illegal Fireworks', 'Window Guard', 'Open Flame Permit', 'DWD', 'Highway Sign - Damaged', 'Ferry Permit', 'X-Ray Machine/Equipment'], dtype=object)
If a column contains text data, we can manipulate it (and filter by it) by accessing the str
attribute of the Series:
ctypes = complaints['Complaint Type']
ctypes[ctypes.str.contains("Noise")].unique()
array(['Noise - Street/Sidewalk', 'Noise - Commercial', 'Noise - Vehicle', 'Noise - House of Worship', 'Noise - Park', 'Noise', 'Noise - Helicopter', 'Collection Truck Noise'], dtype=object)
is_all_noise = complaints['Complaint Type'].str.contains("Noise")
in_brooklyn = complaints['Borough'] == "BROOKLYN"
brooklyn_noise = complaints[is_all_noise & in_brooklyn]
brooklyn_noise.shape
(2344, 51)
brooklyn_noise['Complaint Type'].value_counts()
Noise - Commercial 775 Noise 767 Noise - Street/Sidewalk 456 Noise - Vehicle 237 Noise - Park 60 Noise - House of Worship 23 Noise - Helicopter 23 Collection Truck Noise 3 Name: Complaint Type, dtype: int64
import matplotlib.pyplot as plt
for ctype, count in ctcounts[:10].iteritems():
df = complaints[complaints['Complaint Type'] == ctype]
df.plot(kind='scatter', x='Longitude', y='Latitude', subplots=True, label=ctype, marker=".")
--> Group By: http://pandas.pydata.org/pandas-docs/stable/groupby.html
To do this task, we need to split the data into groups. First off, we need to separate the data by boroughs. We can do this with the groupby
method. This returns a special DataFrameGroupBy
object which we can do some operations on. For example, we can call its size
method to obtain a Series
where the row label is the column we're grouping by, and the values is the number of rows in that group.
complaints.groupby('Borough').size()
Borough BRONX 19686 BROOKLYN 32890 MANHATTAN 24288 QUEENS 22281 STATEN ISLAND 4817 Unspecified 7107 dtype: int64
We can also group by multiple columns:
gb = complaints.groupby(['Borough', 'Complaint Type'])
gb.size()
Borough Complaint Type BRONX APPLIANCE 108 Air Quality 40 Animal Abuse 162 Animal in a Park 17 Asbestos 17 BEST/Site Safety 12 Bike/Roller/Skate Chronic 2 Blocked Driveway 656 Boilers 17 Bridge Condition 4 Broken Muni Meter 121 Building/Use 102 Bus Stop Shelter Placement 1 CONSTRUCTION 18 Construction 2 Consumer Complaint 155 Curb Condition 11 DPR Internal 10 Damaged Tree 105 Dead Tree 55 Derelict Vehicle 80 Derelict Vehicles 83 Dirty Conditions 475 Disorderly Youth 5 Drinking 17 Drinking Water 2 EAP Inspection - F59 1 ELECTRIC 619 Electrical 6 Elevator 75 ... Unspecified Ferry Complaint 4 Ferry Inquiry 32 Ferry Permit 1 For Hire Vehicle Complaint 3 Found Property 1 Highway Condition 3 Invitation 3 Legal Services Provider Complaint 2 Maintenance or Facility 4 Misc. Comments 2 Noise 5 Noise - Street/Sidewalk 1 Noise - Vehicle 1 OEM Literature Request 29 Opinion for the Mayor 2 Parking Card 8 Request for Information 18 SCRIE 9 Sanitation Condition 3 School Maintenance 8 Senior Center Complaint 2 Sewer 4 Street Condition 6 Street Light Condition 112 Taxi Complaint 28 Taxi Compliment 73 Traffic Signal Condition 10 Transportation Provider Complaint 1 Vacant Lot 3 Water System 2 dtype: int64
This may look like a DataFrame (because it has multiple columns), but it's actually a Series with a hierarchical index. To work on it like a DataFrame, we need to convert it into one:
gb.size().to_frame()
0 | ||
---|---|---|
Borough | Complaint Type | |
BRONX | APPLIANCE | 108 |
Air Quality | 40 | |
Animal Abuse | 162 | |
Animal in a Park | 17 | |
Asbestos | 17 | |
BEST/Site Safety | 12 | |
Bike/Roller/Skate Chronic | 2 | |
Blocked Driveway | 656 | |
Boilers | 17 | |
Bridge Condition | 4 | |
Broken Muni Meter | 121 | |
Building/Use | 102 | |
Bus Stop Shelter Placement | 1 | |
CONSTRUCTION | 18 | |
Construction | 2 | |
Consumer Complaint | 155 | |
Curb Condition | 11 | |
DPR Internal | 10 | |
Damaged Tree | 105 | |
Dead Tree | 55 | |
Derelict Vehicle | 80 | |
Derelict Vehicles | 83 | |
Dirty Conditions | 475 | |
Disorderly Youth | 5 | |
Drinking | 17 | |
Drinking Water | 2 | |
EAP Inspection - F59 | 1 | |
ELECTRIC | 619 | |
Electrical | 6 | |
Elevator | 75 | |
... | ... | ... |
Unspecified | Ferry Complaint | 4 |
Ferry Inquiry | 32 | |
Ferry Permit | 1 | |
For Hire Vehicle Complaint | 3 | |
Found Property | 1 | |
Highway Condition | 3 | |
Invitation | 3 | |
Legal Services Provider Complaint | 2 | |
Maintenance or Facility | 4 | |
Misc. Comments | 2 | |
Noise | 5 | |
Noise - Street/Sidewalk | 1 | |
Noise - Vehicle | 1 | |
OEM Literature Request | 29 | |
Opinion for the Mayor | 2 | |
Parking Card | 8 | |
Request for Information | 18 | |
SCRIE | 9 | |
Sanitation Condition | 3 | |
School Maintenance | 8 | |
Senior Center Complaint | 2 | |
Sewer | 4 | |
Street Condition | 6 | |
Street Light Condition | 112 | |
Taxi Complaint | 28 | |
Taxi Compliment | 73 | |
Traffic Signal Condition | 10 | |
Transportation Provider Complaint | 1 | |
Vacant Lot | 3 | |
Water System | 2 |
638 rows × 1 columns
We can also 'flatten' the hierarchical index using reset_index
:
counts = gb.size().to_frame().reset_index()
counts[:5]
Borough | Complaint Type | Count | |
---|---|---|---|
0 | BRONX | APPLIANCE | 108 |
1 | BRONX | Air Quality | 40 |
2 | BRONX | Animal Abuse | 162 |
3 | BRONX | Animal in a Park | 17 |
4 | BRONX | Asbestos | 17 |
We have a column that has been named "0" by default. We can do one of two things. We can either explicitly rename the columns:
counts = counts.rename(columns={0 : "Count"})
counts[:5]
Borough | Complaint Type | Count | |
---|---|---|---|
0 | BRONX | APPLIANCE | 108 |
1 | BRONX | Air Quality | 40 |
2 | BRONX | Animal Abuse | 162 |
3 | BRONX | Animal in a Park | 17 |
4 | BRONX | Asbestos | 17 |
Or we can just specify the column name when calling to_frame
:
counts = gb.size().to_frame("Count").reset_index()
counts[:5]
Borough | Complaint Type | Count | |
---|---|---|---|
0 | BRONX | APPLIANCE | 108 |
1 | BRONX | Air Quality | 40 |
2 | BRONX | Animal Abuse | 162 |
3 | BRONX | Animal in a Park | 17 |
4 | BRONX | Asbestos | 17 |
To compute the top complaint in each borough, we could just take the information we have in the above dataframe, and use regular Python.
First of all, we can iterate over the dataframe like this:
for i, row in counts.iterrows():
print(i, row["Borough"], row["Complaint Type"], row["Count"])
0 BRONX APPLIANCE 108 1 BRONX Air Quality 40 2 BRONX Animal Abuse 162 3 BRONX Animal in a Park 17 4 BRONX Asbestos 17 5 BRONX BEST/Site Safety 12 6 BRONX Bike/Roller/Skate Chronic 2 7 BRONX Blocked Driveway 656 8 BRONX Boilers 17 9 BRONX Bridge Condition 4 10 BRONX Broken Muni Meter 121 11 BRONX Building/Use 102 12 BRONX Bus Stop Shelter Placement 1 13 BRONX CONSTRUCTION 18 14 BRONX Construction 2 15 BRONX Consumer Complaint 155 16 BRONX Curb Condition 11 17 BRONX DPR Internal 10 18 BRONX Damaged Tree 105 19 BRONX Dead Tree 55 20 BRONX Derelict Vehicle 80 21 BRONX Derelict Vehicles 83 22 BRONX Dirty Conditions 475 23 BRONX Disorderly Youth 5 24 BRONX Drinking 17 25 BRONX Drinking Water 2 26 BRONX EAP Inspection - F59 1 27 BRONX ELECTRIC 619 28 BRONX Electrical 6 29 BRONX Elevator 75 30 BRONX Fire Alarm - New System 1 31 BRONX Fire Alarm - Reinspection 1 32 BRONX Fire Safety Director - F58 2 33 BRONX Food Establishment 44 34 BRONX Food Poisoning 21 35 BRONX For Hire Vehicle Complaint 28 36 BRONX GENERAL CONSTRUCTION 2273 37 BRONX General Construction/Plumbing 93 38 BRONX Graffiti 170 39 BRONX HEATING 4093 40 BRONX Harboring Bees/Wasps 2 41 BRONX Hazardous Materials 20 42 BRONX Highway Condition 15 43 BRONX Highway Sign - Damaged 1 44 BRONX Homeless Encampment 17 45 BRONX Homeless Person Assistance 6 46 BRONX Illegal Animal Kept as Pet 2 47 BRONX Illegal Fireworks 2 48 BRONX Illegal Parking 391 49 BRONX Illegal Tree Damage 13 50 BRONX Indoor Air Quality 23 51 BRONX Indoor Sewage 14 52 BRONX Industrial Waste 37 53 BRONX Internal Code 1 54 BRONX Investigations and Discipline (IAD) 5 55 BRONX Lead 8 56 BRONX Litter Basket / Request 11 57 BRONX Maintenance or Facility 42 58 BRONX Miscellaneous Categories 1 59 BRONX Missed Collection (All Materials) 61 60 BRONX Mold 3 61 BRONX NONCONST 1134 62 BRONX Noise 168 63 BRONX Noise - Commercial 136 64 BRONX Noise - Helicopter 1 65 BRONX Noise - House of Worship 11 66 BRONX Noise - Park 9 67 BRONX Noise - Street/Sidewalk 292 68 BRONX Noise - Vehicle 102 69 BRONX Non-Residential Heat 28 70 BRONX Other Enforcement 19 71 BRONX Overflowing Litter Baskets 8 72 BRONX Overgrown Tree/Branches 123 73 BRONX PAINT - PLASTER 1653 74 BRONX PLUMBING 1591 75 BRONX Plant 1 76 BRONX Plumbing 14 77 BRONX Poison Ivy 1 78 BRONX Posting Advertisement 1 79 BRONX Public Assembly 1 80 BRONX Public Payphone Complaint 3 81 BRONX Recycling Enforcement 12 82 BRONX Rodent 456 83 BRONX Root/Sewer/Sidewalk Condition 56 84 BRONX SCRIE 78 85 BRONX STRUCTURAL 4 86 BRONX Sanitation Condition 227 87 BRONX Scaffold Safety 2 88 BRONX School Maintenance 17 89 BRONX Senior Center Complaint 3 90 BRONX Sewer 201 91 BRONX Sidewalk Condition 22 92 BRONX Smoking 12 93 BRONX Special Enforcement 18 94 BRONX Special Natural Area District (SNAD) 1 95 BRONX Special Projects Inspection Team (SPIT) 10 96 BRONX Standing Water 4 97 BRONX Street Condition 379 98 BRONX Street Light Condition 1642 99 BRONX Street Sign - Damaged 17 100 BRONX Street Sign - Dangling 12 101 BRONX Street Sign - Missing 51 102 BRONX Sweeping/Missed-Inadequate 12 103 BRONX Tattooing 1 104 BRONX Taxi Complaint 19 105 BRONX Traffic 21 106 BRONX Traffic Signal Condition 329 107 BRONX Transportation Provider Complaint 1 108 BRONX Unleashed Dog 7 109 BRONX Unsanitary Animal Pvt Property 21 110 BRONX Unsanitary Pigeon Condition 8 111 BRONX Urinating in Public 2 112 BRONX Vacant Lot 20 113 BRONX Vending 23 114 BRONX Violation of Park Rules 17 115 BRONX Water Conservation 15 116 BRONX Water Quality 13 117 BRONX Water System 278 118 BROOKLYN APPLIANCE 175 119 BROOKLYN Air Quality 125 120 BROOKLYN Animal Facility - No Permit 4 121 BROOKLYN Animal in a Park 15 122 BROOKLYN Asbestos 29 123 BROOKLYN BEST/Site Safety 16 124 BROOKLYN Beach/Pool/Sauna Complaint 6 125 BROOKLYN Bike Rack Condition 3 126 BROOKLYN Bike/Roller/Skate Chronic 3 127 BROOKLYN Blocked Driveway 1776 128 BROOKLYN Boilers 26 129 BROOKLYN Bridge Condition 9 130 BROOKLYN Broken Muni Meter 252 131 BROOKLYN Building/Use 256 132 BROOKLYN Bus Stop Shelter Placement 3 133 BROOKLYN CONSTRUCTION 28 134 BROOKLYN City Vehicle Placard Complaint 8 135 BROOKLYN Collection Truck Noise 3 136 BROOKLYN Construction 4 137 BROOKLYN Consumer Complaint 357 138 BROOKLYN Curb Condition 27 139 BROOKLYN DPR Internal 12 140 BROOKLYN Damaged Tree 300 141 BROOKLYN Dead Tree 156 142 BROOKLYN Derelict Bicycle 37 143 BROOKLYN Derelict Vehicle 226 144 BROOKLYN Derelict Vehicles 354 145 BROOKLYN Dirty Conditions 680 146 BROOKLYN Disorderly Youth 8 147 BROOKLYN Drinking 18 148 BROOKLYN Drinking Water 3 149 BROOKLYN EAP Inspection - F59 2 150 BROOKLYN ELECTRIC 987 151 BROOKLYN Electrical 27 152 BROOKLYN Elevator 50 153 BROOKLYN Fire Alarm - New System 1 154 BROOKLYN Fire Alarm - Reinspection 3 155 BROOKLYN Fire Safety Director - F58 10 156 BROOKLYN Food Establishment 117 157 BROOKLYN Food Poisoning 38 158 BROOKLYN For Hire Vehicle Complaint 42 159 BROOKLYN Found Property 2 160 BROOKLYN GENERAL CONSTRUCTION 2791 161 BROOKLYN General Construction/Plumbing 268 162 BROOKLYN Graffiti 468 163 BROOKLYN HEATING 4548 164 BROOKLYN Harboring Bees/Wasps 3 165 BROOKLYN Hazardous Materials 45 166 BROOKLYN Highway Condition 32 167 BROOKLYN Homeless Encampment 59 168 BROOKLYN Homeless Person Assistance 31 169 BROOKLYN Illegal Animal Kept as Pet 5 170 BROOKLYN Illegal Animal Sold 2 171 BROOKLYN Illegal Fireworks 1 172 BROOKLYN Illegal Parking 1147 173 BROOKLYN Illegal Tree Damage 44 174 BROOKLYN Indoor Air Quality 65 175 BROOKLYN Indoor Sewage 48 176 BROOKLYN Industrial Waste 47 177 BROOKLYN Investigations and Discipline (IAD) 12 178 BROOKLYN Lead 68 179 BROOKLYN Litter Basket / Request 92 180 BROOKLYN Maintenance or Facility 167 181 BROOKLYN Missed Collection (All Materials) 299 182 BROOKLYN Mold 9 183 BROOKLYN NONCONST 1500 184 BROOKLYN Noise 767 185 BROOKLYN Noise - Commercial 775 186 BROOKLYN Noise - Helicopter 23 187 BROOKLYN Noise - House of Worship 23 188 BROOKLYN Noise - Park 60 189 BROOKLYN Noise - Street/Sidewalk 456 190 BROOKLYN Noise - Vehicle 237 191 BROOKLYN Non-Residential Heat 24 192 BROOKLYN Other Enforcement 128 193 BROOKLYN Overflowing Litter Baskets 28 194 BROOKLYN Overgrown Tree/Branches 329 195 BROOKLYN PAINT - PLASTER 1821 196 BROOKLYN PLUMBING 2017 197 BROOKLYN Panhandling 1 198 BROOKLYN Plant 14 199 BROOKLYN Plumbing 21 200 BROOKLYN Poison Ivy 1 201 BROOKLYN Posting Advertisement 2 202 BROOKLYN Public Assembly 1 203 BROOKLYN Public Payphone Complaint 8 204 BROOKLYN Public Toilet 2 205 BROOKLYN Recycling Enforcement 15 206 BROOKLYN Rodent 467 207 BROOKLYN Root/Sewer/Sidewalk Condition 261 208 BROOKLYN SCRIE 149 209 BROOKLYN STRUCTURAL 4 210 BROOKLYN Sanitation Condition 716 211 BROOKLYN Scaffold Safety 1 212 BROOKLYN School Maintenance 26 213 BROOKLYN Senior Center Complaint 4 214 BROOKLYN Sewer 404 215 BROOKLYN Sidewalk Condition 108 216 BROOKLYN Smoking 26 217 BROOKLYN Special Enforcement 69 218 BROOKLYN Special Projects Inspection Team (SPIT) 25 219 BROOKLYN Standing Water 16 220 BROOKLYN Street Condition 1012 221 BROOKLYN Street Light Condition 1956 222 BROOKLYN Street Sign - Damaged 520 223 BROOKLYN Street Sign - Dangling 40 224 BROOKLYN Street Sign - Missing 153 225 BROOKLYN Sweeping/Missed-Inadequate 86 226 BROOKLYN Tattooing 1 227 BROOKLYN Taxi Complaint 159 228 BROOKLYN Traffic 46 229 BROOKLYN Traffic Signal Condition 1196 230 BROOKLYN Transportation Provider Complaint 1 231 BROOKLYN Unleashed Dog 6 232 BROOKLYN Unsanitary Animal Facility 4 233 BROOKLYN Unsanitary Animal Pvt Property 29 234 BROOKLYN Unsanitary Pigeon Condition 10 235 BROOKLYN Urinating in Public 12 236 BROOKLYN Vacant Lot 61 237 BROOKLYN Vending 35 238 BROOKLYN Violation of Park Rules 8 239 BROOKLYN Water Conservation 31 240 BROOKLYN Water Quality 28 241 BROOKLYN Water System 517 242 BROOKLYN Window Guard 1 243 MANHATTAN APPLIANCE 89 244 MANHATTAN Adopt-A-Basket 3 245 MANHATTAN Air Quality 203 246 MANHATTAN Animal Abuse 2 247 MANHATTAN Animal Facility - No Permit 2 248 MANHATTAN Animal in a Park 23 249 MANHATTAN Asbestos 60 250 MANHATTAN BEST/Site Safety 18 251 MANHATTAN Bike Rack Condition 4 252 MANHATTAN Bike/Roller/Skate Chronic 24 253 MANHATTAN Blocked Driveway 157 254 MANHATTAN Boilers 24 255 MANHATTAN Bridge Condition 3 256 MANHATTAN Broken Muni Meter 1312 257 MANHATTAN Building/Use 71 258 MANHATTAN Bus Stop Shelter Placement 5 259 MANHATTAN CONSTRUCTION 9 260 MANHATTAN City Vehicle Placard Complaint 11 261 MANHATTAN Collection Truck Noise 4 262 MANHATTAN Construction 8 263 MANHATTAN Consumer Complaint 358 264 MANHATTAN Cranes and Derricks 9 265 MANHATTAN Curb Condition 1 266 MANHATTAN DOF Parking - Tax Exemption 43 267 MANHATTAN DPR Internal 43 268 MANHATTAN Damaged Tree 54 269 MANHATTAN Dead Tree 24 270 MANHATTAN Derelict Bicycle 34 271 MANHATTAN Derelict Vehicle 29 272 MANHATTAN Derelict Vehicles 28 273 MANHATTAN Dirty Conditions 404 274 MANHATTAN Disorderly Youth 4 275 MANHATTAN Drinking 21 276 MANHATTAN Drinking Water 2 277 MANHATTAN EAP Inspection - F59 120 278 MANHATTAN ELECTRIC 333 279 MANHATTAN Electrical 28 280 MANHATTAN Elevator 70 281 MANHATTAN Fire Alarm - Modification 5 282 MANHATTAN Fire Alarm - Reinspection 7 283 MANHATTAN Fire Alarm - Replacement 2 284 MANHATTAN Fire Safety Director - F58 382 285 MANHATTAN Food Establishment 196 286 MANHATTAN Food Poisoning 56 287 MANHATTAN For Hire Vehicle Complaint 52 288 MANHATTAN Found Property 15 289 MANHATTAN GENERAL CONSTRUCTION 1333 290 MANHATTAN General Construction/Plumbing 157 291 MANHATTAN Graffiti 185 292 MANHATTAN HEATING 3494 293 MANHATTAN Harboring Bees/Wasps 2 294 MANHATTAN Hazardous Materials 58 295 MANHATTAN Highway Condition 4 296 MANHATTAN Homeless Encampment 170 297 MANHATTAN Homeless Person Assistance 54 298 MANHATTAN Illegal Animal Kept as Pet 2 299 MANHATTAN Illegal Parking 640 300 MANHATTAN Illegal Tree Damage 37 301 MANHATTAN Indoor Air Quality 122 302 MANHATTAN Indoor Sewage 16 303 MANHATTAN Industrial Waste 27 304 MANHATTAN Internal Code 2 305 MANHATTAN Investigations and Discipline (IAD) 9 306 MANHATTAN Lead 48 307 MANHATTAN Litter Basket / Request 68 308 MANHATTAN Maintenance or Facility 129 309 MANHATTAN Miscellaneous Categories 4 310 MANHATTAN Missed Collection (All Materials) 116 311 MANHATTAN Mold 6 312 MANHATTAN NONCONST 673 313 MANHATTAN Noise 1848 314 MANHATTAN Noise - Commercial 1140 315 MANHATTAN Noise - Helicopter 66 316 MANHATTAN Noise - House of Worship 16 317 MANHATTAN Noise - Park 91 318 MANHATTAN Noise - Street/Sidewalk 917 319 MANHATTAN Noise - Vehicle 255 320 MANHATTAN Non-Residential Heat 36 321 MANHATTAN Open Flame Permit 1 322 MANHATTAN Other Enforcement 91 323 MANHATTAN Overflowing Litter Baskets 32 324 MANHATTAN Overgrown Tree/Branches 36 325 MANHATTAN PAINT - PLASTER 971 326 MANHATTAN PLUMBING 961 327 MANHATTAN Panhandling 6 328 MANHATTAN Plant 3 329 MANHATTAN Plumbing 8 330 MANHATTAN Posting Advertisement 1 331 MANHATTAN Public Payphone Complaint 15 332 MANHATTAN Public Toilet 1 333 MANHATTAN Recycling Enforcement 15 334 MANHATTAN Rodent 372 335 MANHATTAN Root/Sewer/Sidewalk Condition 15 336 MANHATTAN SCRIE 172 337 MANHATTAN STRUCTURAL 3 338 MANHATTAN Sanitation Condition 169 339 MANHATTAN Scaffold Safety 4 340 MANHATTAN School Maintenance 24 341 MANHATTAN Senior Center Complaint 3 342 MANHATTAN Sewer 172 343 MANHATTAN Sidewalk Condition 118 344 MANHATTAN Smoking 53 345 MANHATTAN Special Enforcement 29 346 MANHATTAN Special Projects Inspection Team (SPIT) 38 347 MANHATTAN Standing Water 10 348 MANHATTAN Street Condition 834 349 MANHATTAN Street Light Condition 836 350 MANHATTAN Street Sign - Damaged 38 351 MANHATTAN Street Sign - Dangling 9 352 MANHATTAN Street Sign - Missing 28 353 MANHATTAN Sweeping/Missed-Inadequate 12 354 MANHATTAN Taxi Complaint 870 355 MANHATTAN Traffic 53 356 MANHATTAN Traffic Signal Condition 988 357 MANHATTAN Tunnel Condition 1 358 MANHATTAN Unleashed Dog 4 359 MANHATTAN Unsanitary Animal Pvt Property 12 360 MANHATTAN Unsanitary Pigeon Condition 11 361 MANHATTAN Urinating in Public 10 362 MANHATTAN Vacant Lot 8 363 MANHATTAN Vending 119 364 MANHATTAN Violation of Park Rules 46 365 MANHATTAN Water Conservation 21 366 MANHATTAN Water Quality 26 367 MANHATTAN Water System 961 368 MANHATTAN Window Guard 1 369 QUEENS APPLIANCE 67 370 QUEENS Air Quality 76 371 QUEENS Animal in a Park 16 372 QUEENS Asbestos 30 373 QUEENS BEST/Site Safety 5 374 QUEENS Beach/Pool/Sauna Complaint 1 375 QUEENS Bike/Roller/Skate Chronic 3 376 QUEENS Blocked Driveway 1890 377 QUEENS Boilers 17 378 QUEENS Bridge Condition 4 379 QUEENS Broken Muni Meter 361 380 QUEENS Building/Use 452 381 QUEENS Bus Stop Shelter Placement 5 382 QUEENS CONSTRUCTION 6 383 QUEENS City Vehicle Placard Complaint 12 384 QUEENS Collection Truck Noise 1 385 QUEENS Construction 406 386 QUEENS Consumer Complaint 284 387 QUEENS Cranes and Derricks 1 388 QUEENS Curb Condition 25 389 QUEENS DPR Internal 15 390 QUEENS DWD 1 391 QUEENS Damaged Tree 598 392 QUEENS Dead Tree 331 393 QUEENS Derelict Bicycle 11 394 QUEENS Derelict Vehicle 366 395 QUEENS Derelict Vehicles 317 396 QUEENS Dirty Conditions 991 397 QUEENS Disorderly Youth 3 398 QUEENS Drinking 25 399 QUEENS Drinking Water 4 400 QUEENS EAP Inspection - F59 3 401 QUEENS ELECTRIC 343 402 QUEENS Electrical 21 403 QUEENS Elevator 44 404 QUEENS Fire Alarm - New System 4 405 QUEENS Fire Alarm - Reinspection 6 406 QUEENS Fire Alarm - Replacement 1 407 QUEENS Fire Safety Director - F58 38 408 QUEENS Food Establishment 110 409 QUEENS Food Poisoning 31 410 QUEENS For Hire Vehicle Complaint 38 411 QUEENS GENERAL CONSTRUCTION 850 412 QUEENS General Construction/Plumbing 167 413 QUEENS Graffiti 133 414 QUEENS HEATING 1896 415 QUEENS Harboring Bees/Wasps 5 416 QUEENS Hazardous Materials 43 417 QUEENS Highway Condition 68 418 QUEENS Homeless Encampment 21 419 QUEENS Homeless Person Assistance 8 420 QUEENS Illegal Animal Kept as Pet 10 421 QUEENS Illegal Animal Sold 1 422 QUEENS Illegal Parking 887 423 QUEENS Illegal Tree Damage 50 424 QUEENS Indoor Air Quality 35 425 QUEENS Indoor Sewage 22 426 QUEENS Industrial Waste 25 427 QUEENS Internal Code 2 428 QUEENS Investigations and Discipline (IAD) 10 429 QUEENS Lead 39 430 QUEENS Litter Basket / Request 61 431 QUEENS Maintenance or Facility 90 432 QUEENS Missed Collection (All Materials) 319 433 QUEENS Mold 3 434 QUEENS Municipal Parking Facility 1 435 QUEENS NONCONST 575 436 QUEENS Noise 418 437 QUEENS Noise - Commercial 451 438 QUEENS Noise - Helicopter 9 439 QUEENS Noise - House of Worship 15 440 QUEENS Noise - Park 27 441 QUEENS Noise - Street/Sidewalk 226 442 QUEENS Noise - Vehicle 130 443 QUEENS Non-Residential Heat 17 444 QUEENS Other Enforcement 93 445 QUEENS Overflowing Litter Baskets 17 446 QUEENS Overgrown Tree/Branches 461 447 QUEENS PAINT - PLASTER 580 448 QUEENS PLUMBING 665 449 QUEENS Panhandling 16 450 QUEENS Plant 16 451 QUEENS Plumbing 22 452 QUEENS Posting Advertisement 1 453 QUEENS Public Payphone Complaint 2 454 QUEENS Public Toilet 3 455 QUEENS Recycling Enforcement 8 456 QUEENS Rodent 250 457 QUEENS Root/Sewer/Sidewalk Condition 360 458 QUEENS SCRIE 116 459 QUEENS STRUCTURAL 4 460 QUEENS Sanitation Condition 549 461 QUEENS Scaffold Safety 1 462 QUEENS School Maintenance 15 463 QUEENS Senior Center Complaint 2 464 QUEENS Sewer 656 465 QUEENS Sidewalk Condition 75 466 QUEENS Smoking 21 467 QUEENS Special Enforcement 79 468 QUEENS Special Projects Inspection Team (SPIT) 31 469 QUEENS Stalled Sites 1 470 QUEENS Standing Water 17 471 QUEENS Street Condition 797 472 QUEENS Street Light Condition 2122 473 QUEENS Street Sign - Damaged 73 474 QUEENS Street Sign - Dangling 32 475 QUEENS Street Sign - Missing 77 476 QUEENS Sweeping/Missed-Inadequate 54 477 QUEENS Tattooing 3 478 QUEENS Taxi Complaint 150 479 QUEENS Traffic 34 480 QUEENS Traffic Signal Condition 521 481 QUEENS Trans Fat 1 482 QUEENS Transportation Provider Complaint 1 483 QUEENS Unleashed Dog 13 484 QUEENS Unsanitary Animal Facility 2 485 QUEENS Unsanitary Animal Pvt Property 36 486 QUEENS Unsanitary Pigeon Condition 12 487 QUEENS Urinating in Public 6 488 QUEENS Vacant Lot 29 489 QUEENS Vending 49 490 QUEENS Violation of Park Rules 13 491 QUEENS Water Conservation 53 492 QUEENS Water Quality 12 493 QUEENS Water System 552 494 QUEENS X-Ray Machine/Equipment 1 495 STATEN ISLAND APPLIANCE 6 496 STATEN ISLAND Air Quality 15 497 STATEN ISLAND Animal Facility - No Permit 2 498 STATEN ISLAND Animal in a Park 13 499 STATEN ISLAND Asbestos 7 500 STATEN ISLAND BEST/Site Safety 1 501 STATEN ISLAND Blocked Driveway 111 502 STATEN ISLAND Boilers 2 503 STATEN ISLAND Broken Muni Meter 24 504 STATEN ISLAND Building/Use 61 505 STATEN ISLAND CONSTRUCTION 4 506 STATEN ISLAND City Vehicle Placard Complaint 1 507 STATEN ISLAND Consumer Complaint 36 508 STATEN ISLAND Curb Condition 2 509 STATEN ISLAND DPR Internal 6 510 STATEN ISLAND Damaged Tree 123 511 STATEN ISLAND Dead Tree 64 512 STATEN ISLAND Derelict Bicycle 1 513 STATEN ISLAND Derelict Vehicle 102 514 STATEN ISLAND Derelict Vehicles 19 515 STATEN ISLAND Dirty Conditions 100 516 STATEN ISLAND Disorderly Youth 6 517 STATEN ISLAND Drinking 2 518 STATEN ISLAND ELECTRIC 68 519 STATEN ISLAND Electrical 4 520 STATEN ISLAND Elevator 5 521 STATEN ISLAND Fire Alarm - Reinspection 1 522 STATEN ISLAND Fire Safety Director - F58 9 523 STATEN ISLAND Food Establishment 18 524 STATEN ISLAND Food Poisoning 6 525 STATEN ISLAND For Hire Vehicle Complaint 10 526 STATEN ISLAND GENERAL CONSTRUCTION 224 527 STATEN ISLAND General Construction/Plumbing 53 528 STATEN ISLAND Graffiti 17 529 STATEN ISLAND HEATING 169 530 STATEN ISLAND Harboring Bees/Wasps 2 531 STATEN ISLAND Hazardous Materials 5 532 STATEN ISLAND Highway Condition 8 533 STATEN ISLAND Homeless Encampment 2 534 STATEN ISLAND Illegal Animal Kept as Pet 2 535 STATEN ISLAND Illegal Animal Sold 1 536 STATEN ISLAND Illegal Parking 278 537 STATEN ISLAND Illegal Tree Damage 19 538 STATEN ISLAND Indoor Air Quality 9 539 STATEN ISLAND Indoor Sewage 4 540 STATEN ISLAND Industrial Waste 9 541 STATEN ISLAND Investigations and Discipline (IAD) 4 542 STATEN ISLAND Lead 4 543 STATEN ISLAND Litter Basket / Request 20 544 STATEN ISLAND Maintenance or Facility 23 545 STATEN ISLAND Missed Collection (All Materials) 178 546 STATEN ISLAND Mold 5 547 STATEN ISLAND NONCONST 116 548 STATEN ISLAND Noise 115 549 STATEN ISLAND Noise - Commercial 76 550 STATEN ISLAND Noise - House of Worship 2 551 STATEN ISLAND Noise - Park 4 552 STATEN ISLAND Noise - Street/Sidewalk 36 553 STATEN ISLAND Noise - Vehicle 25 554 STATEN ISLAND Non-Residential Heat 1 555 STATEN ISLAND Other Enforcement 27 556 STATEN ISLAND Overflowing Litter Baskets 11 557 STATEN ISLAND Overgrown Tree/Branches 134 558 STATEN ISLAND PAINT - PLASTER 124 559 STATEN ISLAND PLUMBING 139 560 STATEN ISLAND Plumbing 1 561 STATEN ISLAND Poison Ivy 3 562 STATEN ISLAND Public Payphone Complaint 3 563 STATEN ISLAND Recycling Enforcement 6 564 STATEN ISLAND Rodent 87 565 STATEN ISLAND Root/Sewer/Sidewalk Condition 144 566 STATEN ISLAND SCRIE 6 567 STATEN ISLAND STRUCTURAL 1 568 STATEN ISLAND Sanitation Condition 160 569 STATEN ISLAND School Maintenance 7 570 STATEN ISLAND Sewer 190 571 STATEN ISLAND Sidewalk Condition 16 572 STATEN ISLAND Smoking 1 573 STATEN ISLAND Snow 1 574 STATEN ISLAND Special Enforcement 11 575 STATEN ISLAND Special Natural Area District (SNAD) 3 576 STATEN ISLAND Special Projects Inspection Team (SPIT) 2 577 STATEN ISLAND Standing Water 17 578 STATEN ISLAND Street Condition 445 579 STATEN ISLAND Street Light Condition 449 580 STATEN ISLAND Street Sign - Damaged 43 581 STATEN ISLAND Street Sign - Dangling 17 582 STATEN ISLAND Street Sign - Missing 18 583 STATEN ISLAND Tattooing 1 584 STATEN ISLAND Taxi Complaint 1 585 STATEN ISLAND Traffic 14 586 STATEN ISLAND Traffic Signal Condition 101 587 STATEN ISLAND Unleashed Dog 6 588 STATEN ISLAND Unsanitary Animal Pvt Property 8 589 STATEN ISLAND Vacant Lot 25 590 STATEN ISLAND Vending 3 591 STATEN ISLAND Violation of Park Rules 6 592 STATEN ISLAND Water Conservation 10 593 STATEN ISLAND Water Quality 10 594 STATEN ISLAND Water System 326 595 Unspecified Agency Issues 174 596 Unspecified Benefit Card Replacement 387 597 Unspecified Complaint 9 598 Unspecified Compliment 9 599 Unspecified Consumer Complaint 37 600 Unspecified DCA / DOH New License Application Request 164 601 Unspecified DCA Literature Request 14 602 Unspecified DFTA Literature Request 2 603 Unspecified DHS Income Savings Requirement 1 604 Unspecified DOF Literature Request 5797 605 Unspecified DOT Literature Request 123 606 Unspecified DPR Literature Request 7 607 Unspecified Dirty Conditions 3 608 Unspecified Ferry Complaint 4 609 Unspecified Ferry Inquiry 32 610 Unspecified Ferry Permit 1 611 Unspecified For Hire Vehicle Complaint 3 612 Unspecified Found Property 1 613 Unspecified Highway Condition 3 614 Unspecified Invitation 3 615 Unspecified Legal Services Provider Complaint 2 616 Unspecified Maintenance or Facility 4 617 Unspecified Misc. Comments 2 618 Unspecified Noise 5 619 Unspecified Noise - Street/Sidewalk 1 620 Unspecified Noise - Vehicle 1 621 Unspecified OEM Literature Request 29 622 Unspecified Opinion for the Mayor 2 623 Unspecified Parking Card 8 624 Unspecified Request for Information 18 625 Unspecified SCRIE 9 626 Unspecified Sanitation Condition 3 627 Unspecified School Maintenance 8 628 Unspecified Senior Center Complaint 2 629 Unspecified Sewer 4 630 Unspecified Street Condition 6 631 Unspecified Street Light Condition 112 632 Unspecified Taxi Complaint 28 633 Unspecified Taxi Compliment 73 634 Unspecified Traffic Signal Condition 10 635 Unspecified Transportation Provider Complaint 1 636 Unspecified Vacant Lot 3 637 Unspecified Water System 2
Let's use a dictionary to keep track of the largest complaint type per borough.
max_per_borough = {}
for i, row in counts.iterrows():
borough = row["Borough"]
ctype = row["Complaint Type"]
count = row["Count"]
if borough not in max_per_borough:
max_per_borough[borough] = (None, 0)
max_ctype, max_count = max_per_borough[borough]
if count > max_count:
max_per_borough[borough] = (ctype, count)
max_per_borough
{'BRONX': ('HEATING', 4093), 'BROOKLYN': ('HEATING', 4548), 'MANHATTAN': ('HEATING', 3494), 'QUEENS': ('Street Light Condition', 2122), 'STATEN ISLAND': ('Street Light Condition', 449), 'Unspecified': ('DOF Literature Request', 5797)}
Or, we could use pandas! We can take the counts DataFrame and group it by borough:
counts.groupby('Borough').size()
Borough BRONX 118 BROOKLYN 125 MANHATTAN 126 QUEENS 126 STATEN ISLAND 100 Unspecified 43 dtype: int64
Now, we need to find the max value of Count in each borough:
max_counts = counts.groupby('Borough')["Count"].transform(max)
max_counts
0 4093 1 4093 2 4093 3 4093 4 4093 5 4093 6 4093 7 4093 8 4093 9 4093 10 4093 11 4093 12 4093 13 4093 14 4093 15 4093 16 4093 17 4093 18 4093 19 4093 20 4093 21 4093 22 4093 23 4093 24 4093 25 4093 26 4093 27 4093 28 4093 29 4093 ... 608 5797 609 5797 610 5797 611 5797 612 5797 613 5797 614 5797 615 5797 616 5797 617 5797 618 5797 619 5797 620 5797 621 5797 622 5797 623 5797 624 5797 625 5797 626 5797 627 5797 628 5797 629 5797 630 5797 631 5797 632 5797 633 5797 634 5797 635 5797 636 5797 637 5797 Name: Count, dtype: int64
This actually replaces Count with the maximum value of Count for that borough. Now all we need to do is use fancy indexing:
counts[counts['Count'] == max_counts]
Borough | Complaint Type | Count | |
---|---|---|---|
39 | BRONX | HEATING | 4093 |
163 | BROOKLYN | HEATING | 4548 |
292 | MANHATTAN | HEATING | 3494 |
472 | QUEENS | Street Light Condition | 2122 |
579 | STATEN ISLAND | Street Light Condition | 449 |
604 | Unspecified | DOF Literature Request | 5797 |
complaints.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'> Int64Index: 111069 entries, 26589651 to 26428033 Data columns (total 51 columns): Created Date 111069 non-null object Closed Date 60270 non-null object Agency 111069 non-null object Agency Name 111069 non-null object Complaint Type 111069 non-null object Descriptor 110613 non-null object Location Type 79022 non-null object Incident Zip 98807 non-null object Incident Address 84441 non-null object Street Name 84432 non-null object Cross Street 1 84728 non-null object Cross Street 2 84005 non-null object Intersection Street 1 19364 non-null object Intersection Street 2 19366 non-null object Address Type 102247 non-null object City 98854 non-null object Landmark 95 non-null object Facility Type 19104 non-null object Status 111069 non-null object Due Date 39239 non-null object Resolution Action Updated Date 96507 non-null object Community Board 111069 non-null object Borough 111069 non-null object X Coordinate (State Plane) 98143 non-null float64 Y Coordinate (State Plane) 98143 non-null float64 Park Facility Name 111069 non-null object Park Borough 111069 non-null object School Name 111069 non-null object School Number 111048 non-null object School Region 110524 non-null object School Code 110524 non-null object School Phone Number 111069 non-null object School Address 111069 non-null object School City 111069 non-null object School State 111069 non-null object School Zip 111069 non-null object School Not Found 38984 non-null object School or Citywide Complaint 0 non-null float64 Vehicle Type 99 non-null object Taxi Company Borough 117 non-null object Taxi Pick Up Location 1059 non-null object Bridge Highway Name 185 non-null object Bridge Highway Direction 185 non-null object Road Ramp 180 non-null object Bridge Highway Segment 219 non-null object Garage Lot Name 49 non-null object Ferry Direction 24 non-null object Ferry Terminal Name 70 non-null object Latitude 98143 non-null float64 Longitude 98143 non-null float64 Location 98143 non-null object dtypes: float64(5), object(46) memory usage: 278.1 MB
complaints["School Name"].value_counts()[:10]
Unspecified 110434 Central Park 33 Prospect Park 13 J Hood Wright Park 13 Union Square Park 10 Hudson River Park 10 Riverside Park 9 Macombs Dam Park 8 Sternberg Park 8 School - PS 273 Wortman 7 Name: School Name, dtype: int64
We're storing the string "Unspecified" 110,434 times! What it really represents is missing data. The same is true of most of the "School" columns. We can see this either by quick inspection of the file, or by noticing that those fields have 111,069 "non-null" values. In some cases this makes sense (e.g., every complaint must have a complaint type associated with it), but in this case the "Unspecified" values should be null because they represent missing data.
complaints["School Name"] = complaints["School Name"].replace({"Unspecified": None})
complaints = complaints.replace({"Unspecified": None})
complaints["School Name"].value_counts()[:10]
Central Park 33 Prospect Park 13 J Hood Wright Park 13 Hudson River Park 10 Union Square Park 10 Riverside Park 9 Macombs Dam Park 8 Sternberg Park 8 School - PS 273 Wortman 7 Mccarren Park 7 Name: School Name, dtype: int64
complaints.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'> Int64Index: 111069 entries, 26589651 to 26428033 Data columns (total 51 columns): Created Date 111069 non-null object Closed Date 60270 non-null object Agency 111069 non-null object Agency Name 111069 non-null object Complaint Type 111069 non-null object Descriptor 110613 non-null object Location Type 79022 non-null object Incident Zip 98807 non-null object Incident Address 84441 non-null object Street Name 84432 non-null object Cross Street 1 84728 non-null object Cross Street 2 84005 non-null object Intersection Street 1 19364 non-null object Intersection Street 2 19366 non-null object Address Type 102247 non-null object City 98854 non-null object Landmark 95 non-null object Facility Type 19104 non-null object Status 111069 non-null object Due Date 39239 non-null object Resolution Action Updated Date 96507 non-null object Community Board 111069 non-null object Borough 103962 non-null object X Coordinate (State Plane) 98143 non-null float64 Y Coordinate (State Plane) 98143 non-null float64 Park Facility Name 635 non-null object Park Borough 103962 non-null object School Name 635 non-null object School Number 614 non-null object School Region 90 non-null object School Code 90 non-null object School Phone Number 635 non-null object School Address 635 non-null object School City 635 non-null object School State 635 non-null object School Zip 635 non-null object School Not Found 38984 non-null object School or Citywide Complaint 0 non-null float64 Vehicle Type 99 non-null object Taxi Company Borough 117 non-null object Taxi Pick Up Location 1059 non-null object Bridge Highway Name 185 non-null object Bridge Highway Direction 185 non-null object Road Ramp 180 non-null object Bridge Highway Segment 219 non-null object Garage Lot Name 49 non-null object Ferry Direction 24 non-null object Ferry Terminal Name 70 non-null object Latitude 98143 non-null float64 Longitude 98143 non-null float64 Location 98143 non-null object dtypes: float64(5), object(46) memory usage: 231.2 MB
complaints["Borough"][:10]
Unique Key 26589651 QUEENS 26593698 QUEENS 26594139 MANHATTAN 26595721 MANHATTAN 26590930 MANHATTAN 26592370 QUEENS 26595682 QUEENS 26595195 QUEENS 26590540 MANHATTAN 26594392 BROOKLYN Name: Borough, dtype: object
complaints["Borough"].unique()
array(['QUEENS', 'MANHATTAN', 'BROOKLYN', 'BRONX', 'STATEN ISLAND', None], dtype=object)
complaints["Borough"] = complaints["Borough"].astype('category')
complaints["Borough"][:10]
Unique Key 26589651 QUEENS 26593698 QUEENS 26594139 MANHATTAN 26595721 MANHATTAN 26590930 MANHATTAN 26592370 QUEENS 26595682 QUEENS 26595195 QUEENS 26590540 MANHATTAN 26594392 BROOKLYN Name: Borough, dtype: category Categories (5, object): [BRONX, BROOKLYN, MANHATTAN, QUEENS, STATEN ISLAND]
category_cols = ["Agency", "Agency Name", "Complaint Type", "Descriptor",
"Location Type", "Address Type", "City", "Facility Type",
"Status", "Community Board", "Park Borough"]
for col in category_cols:
complaints[col].astype('category')
complaints.info(verbose=False, memory_usage='deep')
<class 'pandas.core.frame.DataFrame'> Int64Index: 111069 entries, 26589651 to 26428033 Columns: 51 entries, Created Date to Location dtypes: category(1), float64(5), object(45) memory usage: 224.7 MB
grades.csv has (anonymized) grades from a college class. All the students in that class have graduated, and the names of the assignments have been changed to obscure what class this is from.
grades = pd.read_csv('grades.csv')
grades.describe()
hw1 | hw2 | hw3 | hw4 | hw5 | hw6 | Midterm | Final | Total | |
---|---|---|---|---|---|---|---|---|---|
count | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 |
mean | 89.314722 | 179.676944 | 170.680278 | 90.000000 | 85.111111 | 82.576389 | 74.009722 | 76.006944 | 82.227222 |
std | 15.063493 | 28.775365 | 38.551670 | 9.457094 | 12.834324 | 25.453722 | 9.733422 | 10.222350 | 9.891911 |
min | 45.000000 | 39.500000 | 38.000000 | 55.000000 | 50.000000 | 0.000000 | 44.500000 | 45.500000 | 50.950000 |
25% | 89.500000 | 181.000000 | 160.240000 | 90.000000 | 80.750000 | 82.000000 | 67.657500 | 69.375000 | 80.320000 |
50% | 95.000000 | 186.585000 | 189.450000 | 90.000000 | 87.000000 | 88.000000 | 74.815000 | 74.250000 | 84.090000 |
75% | 100.000000 | 195.000000 | 195.427500 | 95.000000 | 95.500000 | 100.000000 | 81.817500 | 83.750000 | 88.405000 |
max | 100.000000 | 200.000000 | 199.150000 | 100.000000 | 100.000000 | 100.000000 | 95.500000 | 96.000000 | 96.950000 |
grades.corr()
hw1 | hw2 | hw3 | hw4 | hw5 | hw6 | Midterm | Final | Total | |
---|---|---|---|---|---|---|---|---|---|
hw1 | 1.000000 | 0.487228 | 0.453700 | 0.409316 | 0.036674 | 0.070779 | 0.175797 | 0.056237 | 0.323693 |
hw2 | 0.487228 | 1.000000 | 0.710138 | 0.604343 | 0.121995 | 0.167828 | 0.402991 | 0.329742 | 0.579868 |
hw3 | 0.453700 | 0.710138 | 1.000000 | 0.613981 | 0.362415 | 0.566413 | 0.353350 | 0.315183 | 0.824843 |
hw4 | 0.409316 | 0.604343 | 0.613981 | 1.000000 | 0.161297 | 0.181675 | 0.177522 | 0.271350 | 0.483939 |
hw5 | 0.036674 | 0.121995 | 0.362415 | 0.161297 | 1.000000 | 0.414708 | 0.172973 | 0.147874 | 0.496866 |
hw6 | 0.070779 | 0.167828 | 0.566413 | 0.181675 | 0.414708 | 1.000000 | 0.168008 | 0.212730 | 0.800330 |
Midterm | 0.175797 | 0.402991 | 0.353350 | 0.177522 | 0.172973 | 0.168008 | 1.000000 | 0.508609 | 0.526797 |
Final | 0.056237 | 0.329742 | 0.315183 | 0.271350 | 0.147874 | 0.212730 | 0.508609 | 1.000000 | 0.584716 |
Total | 0.323693 | 0.579868 | 0.824843 | 0.483939 | 0.496866 | 0.800330 | 0.526797 | 0.584716 | 1.000000 |
grades["hw2"].describe()
count 72.000000 mean 179.676944 std 28.775365 min 39.500000 25% 181.000000 50% 186.585000 75% 195.000000 max 200.000000 Name: hw2, dtype: float64
grades["hw2"] = grades["hw2"] / 2
grades["hw2"].describe()
count 72.000000 mean 89.838472 std 14.387683 min 19.750000 25% 90.500000 50% 93.292500 75% 97.500000 max 100.000000 Name: hw2, dtype: float64
grades["hw3"] = grades["hw3"] / 2
grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean()
hw1 89.314722 hw2 89.838472 hw3 85.340139 hw4 90.000000 hw5 85.111111 hw6 82.576389 dtype: float64
grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean(axis=1)
0 97.542500 1 94.430833 2 96.308333 3 94.430833 4 96.357500 5 96.357500 6 92.655833 7 93.184167 8 95.298333 9 94.789167 10 95.298333 11 98.942500 12 89.498333 13 94.789167 14 96.308333 15 84.232500 16 97.542500 17 94.059167 18 97.061667 19 96.609167 20 93.435000 21 91.570833 22 93.537500 23 88.260000 24 93.267500 25 84.232500 26 89.149167 27 90.345833 28 90.474167 29 93.435000 ... 42 92.398333 43 87.196667 44 92.655833 45 90.353333 46 86.800000 47 87.455833 48 89.149167 49 94.756667 50 90.353333 51 89.498333 52 87.455833 53 80.174167 54 86.800000 55 87.054167 56 87.054167 57 93.537500 58 78.988333 59 69.807500 60 80.174167 61 87.196667 62 69.807500 63 75.238333 64 72.497500 65 62.604167 66 72.497500 67 49.625000 68 61.416667 69 61.416667 70 62.604167 71 49.625000 dtype: float64
grades["hw avg"] = grades[["hw1","hw2","hw3","hw4","hw5","hw6"]].mean(axis=1)
grades.describe()
hw1 | hw2 | hw3 | hw4 | hw5 | hw6 | Midterm | Final | Total | hw avg | |
---|---|---|---|---|---|---|---|---|---|---|
count | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 |
mean | 89.314722 | 89.838472 | 85.340139 | 90.000000 | 85.111111 | 82.576389 | 74.009722 | 76.006944 | 82.227222 | 87.030139 |
std | 15.063493 | 14.387683 | 19.275835 | 9.457094 | 12.834324 | 25.453722 | 9.733422 | 10.222350 | 9.891911 | 11.118237 |
min | 45.000000 | 19.750000 | 19.000000 | 55.000000 | 50.000000 | 0.000000 | 44.500000 | 45.500000 | 50.950000 | 49.625000 |
25% | 89.500000 | 90.500000 | 80.120000 | 90.000000 | 80.750000 | 82.000000 | 67.657500 | 69.375000 | 80.320000 | 86.158125 |
50% | 95.000000 | 93.292500 | 94.725000 | 90.000000 | 87.000000 | 88.000000 | 74.815000 | 74.250000 | 84.090000 | 90.353333 |
75% | 100.000000 | 97.500000 | 97.713750 | 95.000000 | 95.500000 | 100.000000 | 81.817500 | 83.750000 | 88.405000 | 94.152083 |
max | 100.000000 | 100.000000 | 99.575000 | 100.000000 | 100.000000 | 100.000000 | 95.500000 | 96.000000 | 96.950000 | 98.942500 |
grades = grades[["hw1","hw2","hw3","hw4","hw5","hw6","hw avg","Midterm","Final","Total"]]
grades.describe()
hw1 | hw2 | hw3 | hw4 | hw5 | hw6 | hw avg | Midterm | Final | Total | |
---|---|---|---|---|---|---|---|---|---|---|
count | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 | 72.000000 |
mean | 89.314722 | 89.838472 | 85.340139 | 90.000000 | 85.111111 | 82.576389 | 87.030139 | 74.009722 | 76.006944 | 82.227222 |
std | 15.063493 | 14.387683 | 19.275835 | 9.457094 | 12.834324 | 25.453722 | 11.118237 | 9.733422 | 10.222350 | 9.891911 |
min | 45.000000 | 19.750000 | 19.000000 | 55.000000 | 50.000000 | 0.000000 | 49.625000 | 44.500000 | 45.500000 | 50.950000 |
25% | 89.500000 | 90.500000 | 80.120000 | 90.000000 | 80.750000 | 82.000000 | 86.158125 | 67.657500 | 69.375000 | 80.320000 |
50% | 95.000000 | 93.292500 | 94.725000 | 90.000000 | 87.000000 | 88.000000 | 90.353333 | 74.815000 | 74.250000 | 84.090000 |
75% | 100.000000 | 97.500000 | 97.713750 | 95.000000 | 95.500000 | 100.000000 | 94.152083 | 81.817500 | 83.750000 | 88.405000 |
max | 100.000000 | 100.000000 | 99.575000 | 100.000000 | 100.000000 | 100.000000 | 98.942500 | 95.500000 | 96.000000 | 96.950000 |
grades.corr()
hw1 | hw2 | hw3 | hw4 | hw5 | hw6 | hw avg | Midterm | Final | Total | |
---|---|---|---|---|---|---|---|---|---|---|
hw1 | 1.000000 | 0.487228 | 0.453700 | 0.409316 | 0.036674 | 0.070779 | 0.554078 | 0.175797 | 0.056237 | 0.323693 |
hw2 | 0.487228 | 1.000000 | 0.710138 | 0.604343 | 0.121995 | 0.167828 | 0.704075 | 0.402991 | 0.329742 | 0.579868 |
hw3 | 0.453700 | 0.710138 | 1.000000 | 0.613981 | 0.362415 | 0.566413 | 0.917449 | 0.353350 | 0.315183 | 0.824843 |
hw4 | 0.409316 | 0.604343 | 0.613981 | 1.000000 | 0.161297 | 0.181675 | 0.642298 | 0.177522 | 0.271350 | 0.483939 |
hw5 | 0.036674 | 0.121995 | 0.362415 | 0.161297 | 1.000000 | 0.414708 | 0.512807 | 0.172973 | 0.147874 | 0.496866 |
hw6 | 0.070779 | 0.167828 | 0.566413 | 0.181675 | 0.414708 | 1.000000 | 0.702948 | 0.168008 | 0.212730 | 0.800330 |
hw avg | 0.554078 | 0.704075 | 0.917449 | 0.642298 | 0.512807 | 0.702948 | 1.000000 | 0.351264 | 0.322977 | 0.906070 |
Midterm | 0.175797 | 0.402991 | 0.353350 | 0.177522 | 0.172973 | 0.168008 | 0.351264 | 1.000000 | 0.508609 | 0.526797 |
Final | 0.056237 | 0.329742 | 0.315183 | 0.271350 | 0.147874 | 0.212730 | 0.322977 | 0.508609 | 1.000000 | 0.584716 |
Total | 0.323693 | 0.579868 | 0.824843 | 0.483939 | 0.496866 | 0.800330 | 0.906070 | 0.526797 | 0.584716 | 1.000000 |
Here's some ACA data from 2012:
counties = pd.read_csv('uninsured.csv', index_col='FIPS')
aca_enrollment = pd.read_csv('aca_enrollment.csv', index_col='State')
counties[:5]
Geography | State | 2012 Population | 2012 Estimated Number Uninsured | |
---|---|---|---|---|
FIPS | ||||
1001 | Autauga County | Alabama | 54264 | 5720 |
1003 | Baldwin County | Alabama | 184805 | 25515 |
1005 | Barbour County | Alabama | 24435 | 4223 |
1007 | Bibb County | Alabama | 21438 | 3238 |
1009 | Blount County | Alabama | 57155 | 7183 |
aca_enrollment[:5]
State_Partisanship | Marketplace_Type | Federal_grants_awarded | Medicaid_Expansion | Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 | Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary | Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan | QHP_target | |
---|---|---|---|---|---|---|---|---|
State | ||||||||
Alabama | Rep | FFM | 9772451 | No | 799176.0 | 765286 | 97870 | 77000 |
Alaska | Rep | FFM | 0 | No | 120946.0 | 123212 | 12890 | 21000 |
Arizona | Rep | FFM | 30877097 | Yes | 1201770.0 | 1345403 | 120071 | 131000 |
Arkansas | Dem | FFM | 58149831 | Yes | 680920.0 | 829729 | 43446 | 53000 |
California | Dem | SBE | 1065683056 | Yes | 9157000.0 | 10600000 | 1405102 | 686000 |
We want to answer the following question:
Each state uses a federal exchange (FFM) or a state-based exchange (SBE). What is the percentage of uninsured population in all the FFM states and in all the SBE states?
To answer this question, we need to merge two data sets.
First, we need to compute the total population and uninsured people in each state. We can do this with groupby
and aggregate
.
pop_totals = counties.groupby("State").aggregate(sum)
pop_totals[:5]
2012 Population | 2012 Estimated Number Uninsured | |
---|---|---|
State | ||
Alabama | 4477466 | 628475 |
Alaska | 555247 | 99167 |
Arizona | 6361836 | 1090722 |
Arkansas | 2407692 | 397994 |
California | 37087767 | 6731712 |
We also want the percentage of uninsured:
pop_totals["Percent uninsured"] = 100 * pop_totals["2012 Estimated Number Uninsured"] / pop_totals["2012 Population"]
pop_totals[:5]
2012 Population | 2012 Estimated Number Uninsured | Percent uninsured | |
---|---|---|---|
State | |||
Alabama | 4477466 | 628475 | 14.036399 |
Alaska | 555247 | 99167 | 17.859979 |
Arizona | 6361836 | 1090722 | 17.144768 |
Arkansas | 2407692 | 397994 | 16.530104 |
California | 37087767 | 6731712 | 18.150761 |
aca_enrollment_with_pop = pd.merge(pop_totals, aca_enrollment, left_index=True, right_index=True)
Note: In this case, the join column happened to be the index in both dataframes. That is not always the case. If the "State" column wasn't an index (in either dataframe) we would've written something like:
pd.merge(pop_totals, aca_enrollment, on="State")
aca_enrollment_with_pop[:5]
2012 Population | 2012 Estimated Number Uninsured | Percent uninsured | State_Partisanship | Marketplace_Type | Federal_grants_awarded | Medicaid_Expansion | Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 | Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary | Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan | QHP_target | |
---|---|---|---|---|---|---|---|---|---|---|---|
State |
What happened? Turns out the county data has a space before the name of each state!
counties.iloc[0]["State"]
' Alabama'
counties["State"] = counties["State"].str.strip()
pop_totals = counties.groupby("State").aggregate(sum)
pop_totals["Percent uninsured"] = 100 * pop_totals["2012 Estimated Number Uninsured"] / pop_totals["2012 Population"]
aca_enrollment_with_pop = pd.merge(pop_totals, aca_enrollment, left_index=True, right_index=True)
aca_enrollment_with_pop[:5]
2012 Population | 2012 Estimated Number Uninsured | Percent uninsured | State_Partisanship | Marketplace_Type | Federal_grants_awarded | Medicaid_Expansion | Pre-open_Enrollment_Monthly_Average_Medicaid_and_CHIP_Enrollment_July-Sept_2013 | Total_Medicaid_and_CHIP_Enrollment_April_2014_Preliminary | Number_of_Individuals_Who_Have_Selected_a_Marketplace_Plan | QHP_target | |
---|---|---|---|---|---|---|---|---|---|---|---|
State | |||||||||||
Alabama | 4477466 | 628475 | 14.036399 | Rep | FFM | 9772451 | No | 799176.0 | 765286 | 97870 | 77000 |
Alaska | 555247 | 99167 | 17.859979 | Rep | FFM | 0 | No | 120946.0 | 123212 | 12890 | 21000 |
Arizona | 6361836 | 1090722 | 17.144768 | Rep | FFM | 30877097 | Yes | 1201770.0 | 1345403 | 120071 | 131000 |
Arkansas | 2407692 | 397994 | 16.530104 | Dem | FFM | 58149831 | Yes | 680920.0 | 829729 | 43446 | 53000 |
California | 37087767 | 6731712 | 18.150761 | Dem | SBE | 1065683056 | Yes | 9157000.0 | 10600000 | 1405102 | 686000 |
by_exch_type = aca_enrollment_with_pop.groupby("Marketplace_Type").aggregate(sum)[["2012 Population", "2012 Estimated Number Uninsured"]]
by_exch_type
2012 Population | 2012 Estimated Number Uninsured | |
---|---|---|
Marketplace_Type | ||
FFM | 192600064 | 30222152 |
SBE | 101813370 | 14174778 |
by_exch_type["Percent uninsured"] = 100 * by_exch_type["2012 Estimated Number Uninsured"] / by_exch_type["2012 Population"]
by_exch_type
2012 Population | 2012 Estimated Number Uninsured | Percent uninsured | |
---|---|---|---|
Marketplace_Type | |||
FFM | 192600064 | 30222152 | 15.691662 |
SBE | 101813370 | 14174778 | 13.922315 |
Similarly, we can do this by party:
by_party = aca_enrollment_with_pop.groupby("State_Partisanship").aggregate(sum)[["2012 Population", "2012 Estimated Number Uninsured"]]
by_party["Percent uninsured"] = 100 * by_party["2012 Estimated Number Uninsured"] / by_party["2012 Population"]
by_party
2012 Population | 2012 Estimated Number Uninsured | Percent uninsured | |
---|---|---|---|
State_Partisanship | |||
Dem | 130659500 | 17769259 | 13.599669 |
Rep | 163753934 | 26627671 | 16.260782 |