Appendix C — Arrests

BLN Record layout

As of October 15, 2025 (Dec. 2 update), there were 372,266 arrests in the BLN arrest table, from which duplicates have been removed. See the documentation below for details.

position	name	type	missing	description
1	arrest_id	integer	0	Original row number from the FOIA spreadsheet
2	bln_person_id	character	0	BLN’s person identifier for arrests, the first 10 characters of the original unique identifier if it exists, or a combination of birth year and row number for cases where it doesn’t
3	apprehension_date	Date	0	Original recorded date of arrest
4	is_last_arrest	logical	0	TRUE if it is the most recent arrest for the person ID
5	apprehension_aor	character	5752	ICE Enforcement and Removal Operations area of responsibility
6	bln_arrest_state	character	3828	BLN’s best estimate of state. See methodology for more detail
7	apprehension_site_landmark	character	8605
8	apprehension_method_recoded	character	0	At-large or Custodial, see common definitions
9	apprehension_criminality	character	0	Original criminality recorded in the arrest table: convicted criminal, pending charges or other immigration violator
10	bln_arrest_charge_code	character	0	See common definitions for details
11	bln_charge	character	0	BLN charge description
12	bln_charge_group_code	character	0	2-digit crime code, to make for fewer categories
13	bln_charge_group	character	0	Description of the 2-digit crime code
14	bln_charge_special	character	53153	Special types of charges
15	birth_year	numeric	1
16	citizenship_country	character	0
17	gender	character	0
18	case_status_recoded	character	8931	See common definitions
19	departed_date	Date	137519
20	departure_country	character	137573
21	final_order_date	Date	144622	Date of the removal order from immigration court
22	detention_facility	character	52399	Name of the detention facility matching this arrest
23	detention_state	character	52567
24	detention_city	character	52469
25	detention_county	character	52567
26	detention_days_after	integer	52399	# of days after the arrest that the detention began. Negative numbers reflect detentions that are recorded as starting just before the arrest
27	detainer_state	character	207066	State of the most recent detainer
28	detainer_facility	character	206910	Name of the most recent detainer facility
29	detainer_city	character	207066
30	detainer_county	character	207117	Detainer county based on geocoding of the facility name and city – approximate
31	detainer_days_before	integer	206906	# of days before the arrest that the detainer was sent. Negative numbers reflect detainers apparently entered up to several days before the arrest

Person identifiers

About 99 percent of the rows in the original FOIA data contained a unique_identifier, which is a 32-character anonymized hash of an immigrant’s A-number (sort of like a Social Security number for people going through the immigration system).

The remainder either had no A-number yet – for example, they’ve never come into contact with immigration officials before – or the person couldn’t be properly identified at the point of arrest.

BLN has shortened this 32-character hash to its first 10 characters.¹ It just makes the ID easier to read in tables. These are lower-case letters and numbers that look random. An example is 002184ec34. We don’t know if this hash changes each time the data is released.

That left about 5,500 rows with no unique identifier for a person. There was no way to guess whether or not these people were arrested more than once because the only identifying information is year of birth, gender, and nationality. Those rows were assigned a unique ID beginning with the year of birth and followed by the original row number. An example is 1990_376607. (We may have to change this in the future if there are more than 999,999 arrests in the table.) This ID will change each time the data is released.
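The ID rule described above can be sketched in a few lines of Python. This is a minimal illustration, not BLN's actual code: the function name `make_person_id` and the sample 32-character hash are made up, and it assumes a missing identifier arrives as `None` or an empty string.

```python
from typing import Optional


def make_person_id(unique_identifier: Optional[str], birth_year: int, row_number: int) -> str:
    """Build a BLN-style person ID (illustrative only)."""
    if unique_identifier:
        # Keep only the first 10 characters of the 32-character hash.
        return unique_identifier[:10]
    # No hash available: fall back to birth year plus the original row number.
    return f"{birth_year}_{row_number}"


# The 32-character hash below is invented for this example.
print(make_person_id("002184ec34a1b2c3d4e5f60718293a4b", 1985, 12))  # 002184ec34
print(make_person_id(None, 1990, 376607))  # 1990_376607
```

Because the fallback ID embeds the row number, it identifies a row rather than a person, which is why it cannot be used to link repeat arrests.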

Each row has a column called is_last_arrest, which shows whether or not that was the latest recorded arrest for that person. Repeat arrests of the same person weren’t removed because you may want to know how many people were arrested in a given month or year, and those counts should include every unique person arrested in that period.
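As a rough illustration of the difference between counting arrests and counting people, the sketch below uses a handful of invented rows with the column names from the record layout. It assumes dates are ISO strings; the helper `month_of` is hypothetical.

```python
from collections import Counter

# Toy rows using column names from the record layout; all values are invented.
arrests = [
    {"bln_person_id": "aaaa000001", "apprehension_date": "2025-03-04", "is_last_arrest": False},
    {"bln_person_id": "aaaa000001", "apprehension_date": "2025-08-01", "is_last_arrest": False},
    {"bln_person_id": "aaaa000001", "apprehension_date": "2025-08-19", "is_last_arrest": True},
    {"bln_person_id": "bbbb000002", "apprehension_date": "2025-08-02", "is_last_arrest": True},
]


def month_of(row):
    """Return the YYYY-MM part of an ISO date string."""
    return row["apprehension_date"][:7]


# Every row is one arrest.
arrests_per_month = Counter(month_of(r) for r in arrests)

# Count each person at most once per month.
person_months = {(month_of(r), r["bln_person_id"]) for r in arrests}
people_per_month = Counter(m for m, _ in person_months)

# Across the whole table, is_last_arrest marks exactly one row per person.
people_overall = sum(1 for r in arrests if r["is_last_arrest"])

print(arrests_per_month["2025-08"], people_per_month["2025-08"], people_overall)  # 3 2 2
```

Filtering on is_last_arrest alone gives the number of people in the whole table, but it would drop the March arrest here, so per-period counts of people are better built from the distinct person IDs within each period.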

Duplicates

There were about 5,500 apparent true duplicate rows in the original arrest data. These are a combination of exact matches on every column, matches on the person ID and the date-time of arrest, and matches on the person ID and an arrest within the following 24 hours. We retained the most recent of these, even though it sometimes meant we didn’t have quite as much information on the arrest.
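One way to express this rule is sketched below. It is a simplified reading, not BLN's actual procedure: `apprehension_dt` is a hypothetical date-time column, the 24-hour window is measured from the most recently kept row for the same person, and exact and same-timestamp duplicates are treated as the zero-gap case of that window.

```python
from datetime import datetime, timedelta


def drop_near_duplicates(rows, window=timedelta(hours=24)):
    """Keep the latest row from each run of same-person arrests spaced <= window apart."""
    ordered = sorted(rows, key=lambda r: (r["bln_person_id"], r["apprehension_dt"]))
    kept = []
    for row in ordered:
        prev = kept[-1] if kept else None
        if (
            prev is not None
            and prev["bln_person_id"] == row["bln_person_id"]
            and row["apprehension_dt"] - prev["apprehension_dt"] <= window
        ):
            kept[-1] = row  # same event: the later row replaces the earlier one
        else:
            kept.append(row)
    return kept


# Invented rows: arrests 1 and 2 are 12.5 hours apart for the same person.
rows = [
    {"arrest_id": 1, "bln_person_id": "aaaa000001", "apprehension_dt": datetime(2025, 6, 1, 9, 0)},
    {"arrest_id": 2, "bln_person_id": "aaaa000001", "apprehension_dt": datetime(2025, 6, 1, 21, 30)},
    {"arrest_id": 3, "bln_person_id": "aaaa000001", "apprehension_dt": datetime(2025, 9, 10, 8, 0)},
    {"arrest_id": 4, "bln_person_id": "bbbb000002", "apprehension_dt": datetime(2025, 6, 1, 9, 0)},
]
print([r["arrest_id"] for r in drop_near_duplicates(rows)])  # [2, 3, 4]
```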

BLN state of arrest

About 60,000 of the original 377,000 rows in the FOIA data lacked the state in which the arrest occurred. BLN has made every effort to find the best possible guess for the correct state.

We felt it crucial to fill in this missing data to avoid over-estimating the increase in arrests in a given area. From July 2025 forward, the omission is extremely rare; it’s much more common before 2025. This means that if you used the original state as a first step toward filtering the data to your area, you would risk seriously overstating the increase in arrests in late 2025 – you just wouldn’t have any rows for the missing data.

We were able to estimate a state for almost all of the missing information by taking the first available value from the following, in order:

1. Use the assigned state from the original data.
2. If the Area of Responsibility was entirely within one state, use that state. This filled in a lot of the rows from Texas, California and New York City.
3. Match a detention to an arrest based on the person ID and the date, and use the state of the first detention location. There is some error in areas that cross state lines, such as Kansas City or Washington, D.C., where the first detention location is usually across a state border.
4. Match a detainer to an arrest based on the person ID and the date. A detainer can be anytime before an arrest and this data only goes back to 2023, so we used the most recent detainer state before the arrest.
5. Use the landmark site where it clearly points to one state. The landmark is often unclear, such as in Atlanta, where they seem to code anything in that area of responsibility as an Atlanta landmark. But elsewhere it’s a lot clearer, such as general metro areas around Minneapolis or Dallas. We counted the number of arrests for each landmark by state in the original data. If one state accounted for at least 90 percent of the arrests at that landmark, we used that state. Otherwise, we left it blank.

This left fewer than 1,000 rows without an assigned state.
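The fallback order above amounts to "take the first non-missing candidate." The sketch below shows that idea along with the 90 percent landmark rule. It is an illustration under stated assumptions, not BLN's code: the short field names (`state`, `aor`, `landmark`), the `single_state_aors` mapping, and the labels `AOR-A`, `AOR-B`, `LM1` and `LM2` are all hypothetical, and the detention and detainer states are assumed to have been matched to the arrest already.

```python
from collections import Counter, defaultdict


def landmark_state_lookup(rows, threshold=0.90):
    """Map landmark -> state when one state holds >= threshold of its known-state arrests."""
    counts = defaultdict(Counter)
    for r in rows:
        if r.get("landmark") and r.get("state"):
            counts[r["landmark"]][r["state"]] += 1
    lookup = {}
    for landmark, by_state in counts.items():
        state, n = by_state.most_common(1)[0]
        if n / sum(by_state.values()) >= threshold:
            lookup[landmark] = state
    return lookup


def best_state(row, single_state_aors, landmark_lookup):
    """Return the first available state estimate, or None if every source is missing."""
    candidates = (
        row.get("state"),                          # 1. original state
        single_state_aors.get(row.get("aor")),     # 2. AOR that lies within one state
        row.get("detention_state"),                # 3. first matched detention
        row.get("detainer_state"),                 # 4. most recent prior detainer
        landmark_lookup.get(row.get("landmark")),  # 5. landmark dominated by one state
    )
    return next((c for c in candidates if c), None)


# Invented data: LM1 is 95 percent one state, LM2 is split evenly.
known = (
    [{"landmark": "LM1", "state": "MN"}] * 19
    + [{"landmark": "LM1", "state": "WI"}]
    + [{"landmark": "LM2", "state": "GA"}, {"landmark": "LM2", "state": "AL"}]
)
lookup = landmark_state_lookup(known)
aors = {"AOR-A": "TX"}  # hypothetical single-state AOR

print(best_state({"aor": "AOR-A", "detention_state": "OK"}, aors, lookup))  # TX
print(best_state({"aor": "AOR-B", "landmark": "LM1"}, aors, lookup))        # MN
print(best_state({"aor": "AOR-B", "landmark": "LM2"}, aors, lookup))        # None
```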

BLN crime codes

The original FOIA data contained only an indicator for whether the person 1) had a criminal conviction, 2) faced criminal charges without a conviction, or 3) was an “other immigration violator”.² Importantly, there was no detail on what crime was associated with the conviction.

In other tables, such as the detentions, ICE assigns what it calls the “Most serious criminal conviction” for each person. Previous reporting on this coding and a review of forms used by ICE suggest that this conviction is the one in a person’s history that carried the longest sentence. Generally, violent crimes, drugs and weapon charges are considered more serious than traffic, which in turn can be more serious than entering the country illegally.

We matched the detention and detainer data to the arrests in the process outlined above, and chose the most serious criminal conviction. If there was more than one, we used the most recent, under the idea that we wanted to overestimate more serious crimes rather than underestimate them.

Each row in the BLN data has a code and description for detailed and more general crimes. We only assigned a crime code to those that were recorded in the original data as “1 Convicted Criminal”; the other two categories were assigned a code that would indicate there was no criminal conviction in the arrest record. Similarly, if the arrest record indicated a conviction but we couldn’t find one, we assigned it a code to indicate it was unknown.
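The assignment rule in the paragraph above can be written as a small decision function. This is a sketch only: the placeholder values `NO_CONVICTION` and `UNKNOWN` stand in for BLN's real codes, which are documented in the common definitions; `"1 Convicted Criminal"` is the only category label quoted here, so the other label string in the example is an assumption, and the conviction code `"1234"` is invented.

```python
# Placeholder values, not BLN's real codes.
NO_CONVICTION = "NO_CONVICTION"
UNKNOWN = "UNKNOWN"


def assign_charge_code(apprehension_criminality, matched_conviction_code):
    """Pick a charge code from the arrest's criminality flag and any matched conviction."""
    if apprehension_criminality != "1 Convicted Criminal":
        # Pending charges or other immigration violator: no conviction in the arrest record.
        return NO_CONVICTION
    # Convicted per the arrest record; use the matched conviction if one was found.
    return matched_conviction_code or UNKNOWN


print(assign_charge_code("1 Convicted Criminal", "1234"))           # 1234
print(assign_charge_code("1 Convicted Criminal", None))             # UNKNOWN
print(assign_charge_code("3 Other Immigration Violator", "1234"))   # NO_CONVICTION
```

Note that in this reading a conviction found in the detention or detainer tables is ignored when the arrest record itself does not indicate a conviction.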

More detail on the meaning of the individual codes is shown in the “Common data elements” chapter.

Simplified categories

At the risk of losing some detail, BLN has simplified some of the categories used in the original data to make for more straightforward analysis.

Apprehension method

The apprehension method refers to the circumstances under which the person has been arrested. We have used ICE’s own categorization suggested by the 2024 ICE annual report, page 15:

At-large arrests, which take place within the community (i.e., outside of jails or prisons).
Custodial arrests, which take place in the confines of a jail or prison, as CAP works with federal, state and local law enforcement partners to take custody of removable noncitizens who have already been arrested by another law enforcement agency for criminal activity.

The following 7 categories in the original data were combined to create the “Custodial” grouping: 287(g) Program, CAP Federal Incarceration, CAP Local Incarceration, CAP State Incarceration, Criminal Alien Program, Custodial Arrest, Other Agency (turned over to INS). The remaining 20 categories were combined to create the “At-large” arrests. (Most of these weren’t very descriptive, such as “non-custodial arrest” or “ERO Reprocessed Arrest”.)
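The recoding is a simple membership test against the seven custodial categories listed above. The sketch below assumes the original values match those labels exactly apart from surrounding whitespace; the function name is hypothetical.

```python
# The seven original categories grouped as "Custodial".
CUSTODIAL_METHODS = {
    "287(g) Program",
    "CAP Federal Incarceration",
    "CAP Local Incarceration",
    "CAP State Incarceration",
    "Criminal Alien Program",
    "Custodial Arrest",
    "Other Agency (turned over to INS)",
}


def recode_apprehension_method(original):
    """Collapse the original apprehension method into Custodial or At-large."""
    cleaned = (original or "").strip()
    # Anything outside the custodial list falls into the At-large grouping.
    return "Custodial" if cleaned in CUSTODIAL_METHODS else "At-large"


print(recode_apprehension_method("CAP Local Incarceration"))  # Custodial
print(recode_apprehension_method("Non-Custodial Arrest"))     # At-large
```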

This breakout doesn’t exactly match what some news organizations have done, which is to distinguish “street” arrests from other at-large arrests. A street arrest might not include, for example, worksite raids. We decided to simplify the categories, so in this data you can use the terms “street arrest” and “at-large arrest” interchangeably.

Detention details

About 85 percent of the arrests could be matched to a detention. We used the Deportation Data Project’s criteria to match a detention to an arrest: the detention must begin within 10 days after the arrest, or less than 5 days before it. This caught some of the delays we saw in the arrest dates. The table includes the number of days between the arrest and the start of the detention (detention_days_after), with negative numbers indicating the arrest was recorded after the detention began.
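The matching window can be sketched as a date comparison. The boundaries here are one reading of the criteria (day 10 after the arrest counts, day 5 before does not), and choosing the earliest qualifying detention is an assumption based on the "first detention location" wording elsewhere in this appendix. Function names are hypothetical.

```python
from datetime import date


def detention_days_after(arrest_date, detention_date):
    """Days from arrest to detention start; negative if the detention starts first."""
    return (detention_date - arrest_date).days


def is_match(arrest_date, detention_date):
    """True if the detention starts less than 5 days before or up to 10 days after the arrest."""
    return -5 < detention_days_after(arrest_date, detention_date) <= 10


def first_matching_detention(arrest_date, detention_dates):
    """Return the earliest detention date inside the window, or None."""
    matches = sorted(d for d in detention_dates if is_match(arrest_date, d))
    return matches[0] if matches else None


arrest = date(2025, 7, 10)
print(detention_days_after(arrest, date(2025, 7, 8)))   # -2
print(is_match(arrest, date(2025, 7, 5)))               # False
print(first_matching_detention(arrest, [date(2025, 7, 12), date(2025, 7, 8), date(2025, 7, 1)]))  # 2025-07-08
```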

Detainer details

Less than half of the detainers could be matched to an arrest record. One reason is that detainers are much less likely to have a person identifier. We used the most recent detainer recorded for a person as a match, even if it was several years ago. The reason is that the detainers don’t expire, and the person could have been incarcerated locally or at the state level the entire time. We did not include as a match any detainer after the arrest.
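A minimal sketch of the "most recent detainer before the arrest" rule follows. It assumes the detainers passed in already belong to the same person, treats a detainer dated the same day as the arrest as eligible (an assumption), and uses a hypothetical `detainer_date` field; the sample rows are invented.

```python
from datetime import date


def latest_detainer_before(arrest_date, detainers):
    """Return the most recent detainer dated on or before the arrest, or None."""
    eligible = [d for d in detainers if d["detainer_date"] <= arrest_date]
    if not eligible:
        return None
    best = max(eligible, key=lambda d: d["detainer_date"])
    # Attach the gap in days, mirroring the detainer_days_before column.
    return {**best, "detainer_days_before": (arrest_date - best["detainer_date"]).days}


detainers = [
    {"detainer_date": date(2023, 5, 10), "detainer_state": "TX"},
    {"detainer_date": date(2025, 8, 20), "detainer_state": "OK"},
    {"detainer_date": date(2025, 9, 5), "detainer_state": "LA"},  # after the arrest: ignored
]
match = latest_detainer_before(date(2025, 9, 1), detainers)
print(match["detainer_state"], match["detainer_days_before"])  # OK 12
```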

¹ There is a very small chance that this could result in improper duplicates. The data has been checked to make sure that never happened across all of the tables in the collection.
² ICE contends that everyone they arrest is an immigration violator.