Appendix C — Arrests

BLN Record layout

As of October 15, 2025 (Dec. 2 update), there were 372,266 arrests in the BLN arrest table, from which duplicates have been removed. See the documentation below for details.

| position | name | type | missing | description |
|---|---|---|---|---|
| 1 | arrest_id | integer | 0 | Original row number from the FOIA spreadsheet |
| 2 | bln_person_id | character | 0 | BLN’s person identifier for arrests: the first 10 characters of the original unique identifier if it exists, or a combination of birth year and row number where it doesn’t |
| 3 | apprehension_date | Date | 0 | Original recorded date of arrest |
| 4 | is_last_arrest | logical | 0 | TRUE if it is the most recent arrest for the person ID |
| 5 | apprehension_aor | character | 5752 | ICE Enforcement and Removal Operations area of responsibility |
| 6 | bln_arrest_state | character | 3828 | BLN’s best estimate of state. See methodology for more detail |
| 7 | apprehension_site_landmark | character | 8605 | |
| 8 | apprehension_method_recoded | character | 0 | At-large or Custodial, see common definitions |
| 9 | apprehension_criminality | character | 0 | Original criminality recorded in the arrest table, either no conviction, pending charges or other immigration violator |
| 10 | bln_arrest_charge_code | character | 0 | See common definitions for details |
| 11 | bln_charge | character | 0 | BLN charge description |
| 12 | bln_charge_group_code | character | 0 | 2-digit crime code, to make for fewer categories |
| 13 | bln_charge_group | character | 0 | Description of the 2-digit crime code |
| 14 | bln_charge_special | character | 53153 | Special types of charges |
| 15 | birth_year | numeric | 1 | |
| 16 | citizenship_country | character | 0 | |
| 17 | gender | character | 0 | |
| 18 | case_status_recoded | character | 8931 | See common definitions |
| 19 | departed_date | Date | 137519 | |
| 20 | departure_country | character | 137573 | |
| 21 | final_order_date | Date | 144622 | Date of the removal order from immigration court |
| 22 | detention_facility | character | 52399 | Name of the detention facility matching this arrest |
| 23 | detention_state | character | 52567 | |
| 24 | detention_city | character | 52469 | |
| 25 | detention_county | character | 52567 | |
| 26 | detention_days_after | integer | 52399 | Number of days after the arrest that the detention began. Negative numbers reflect detentions that are recorded as starting just before the arrest |
| 27 | detainer_state | character | 207066 | State of the most recent detainer |
| 28 | detainer_facility | character | 206910 | Name of the most recent detainer facility |
| 29 | detainer_city | character | 207066 | |
| 30 | detainer_county | character | 207117 | Detainer county based on geocoding of the facility name and city – approximate |
| 31 | detainer_days_before | integer | 206906 | Number of days before the arrest that the detainer was sent. Negative numbers reflect detainers apparently entered up to several days before the arrest |

Person identifiers

About 99 percent of the rows in the original FOIA data contained a unique_identifier, which is a 32-character anonymized hash of an immigrant’s A-number (sort of like a Social Security number for people going through the immigration system).

The remainder either had no A-number yet – for example, they’ve never come into contact with immigration officials before – or the person couldn’t be properly identified at the point of arrest.

BLN has shortened this 32-character hash to its first 10 characters,[1] which makes the ID easier to read in tables. The result is a string of lowercase letters and numbers that looks random, such as 002184ec34. We don’t know if this hash changes each time the data is released.

That left about 5,500 rows with no unique identifier for a person. There was no way to guess whether or not these people were arrested more than once because the only identifying information is year of birth, gender, and nationality. Those rows were assigned a unique ID beginning with the year of birth and followed by the original row number. An example is 1990_376607. (We may have to change this in the future if there are more than 999,999 arrests in the table.) This ID will change each time the data is released.
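The two ID rules above can be sketched in a few lines of Python. This is only an illustration, not BLN's actual code: the function name `make_person_id` and the example hash are invented, and only the 10-character prefix and the year_row pattern come from the text.

```python
def make_person_id(unique_identifier, birth_year, row_number):
    """Return a short person ID following the two rules described above."""
    if unique_identifier:
        # A hashed A-number is present: keep only its first 10 characters.
        return unique_identifier[:10]
    # No hash available: fall back to birth year plus the original row number.
    return f"{birth_year}_{row_number}"


# Hypothetical 32-character hash, invented for illustration only.
example_hash = "002184ec34" + "0" * 22

print(make_person_id(example_hash, 1985, 12))  # 002184ec34
print(make_person_id(None, 1990, 376607))      # 1990_376607
```

Because the fallback ID embeds the row number, it identifies a row rather than a person, which is why it can't be used to link repeat arrests.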

Each row has a column called is_last_arrest, which shows whether or not that was the latest recorded arrest for that person. Repeat arrests of the same person weren’t removed because you may want to know how many people were arrested in a given month or year, and you’d want each person arrested in that period to be counted.
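As a rough sketch of how these columns might be used, the snippet below counts arrests and unique people per month, and uses is_last_arrest to count each person once overall. The four rows and their IDs are made up for illustration; only the column names come from the record layout.

```python
from collections import defaultdict
from datetime import date

# Invented example rows: one person arrested three times, another once.
arrests = [
    {"bln_person_id": "aaa0000001", "apprehension_date": date(2025, 3, 4), "is_last_arrest": False},
    {"bln_person_id": "aaa0000001", "apprehension_date": date(2025, 3, 20), "is_last_arrest": False},
    {"bln_person_id": "aaa0000001", "apprehension_date": date(2025, 6, 2), "is_last_arrest": True},
    {"bln_person_id": "bbb0000002", "apprehension_date": date(2025, 3, 9), "is_last_arrest": True},
]

arrests_per_month = defaultdict(int)
people_per_month = defaultdict(set)
for row in arrests:
    month = row["apprehension_date"].strftime("%Y-%m")
    arrests_per_month[month] += 1                       # every row is one arrest
    people_per_month[month].add(row["bln_person_id"])   # a set keeps each ID once

unique_people_per_month = {m: len(ids) for m, ids in people_per_month.items()}

# Counting only the latest arrest gives one row per person across the whole table.
people_overall = sum(1 for row in arrests if row["is_last_arrest"])
```

In this toy data, March has three arrests but only two people, and the whole table covers two people.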

Duplicates

There were about 5,500 apparent true duplicate rows in the original arrest data. These fall into three kinds of match: exact matches on every column; matches on the person ID and the date-time of arrest; and matches on the person ID with another arrest within the following 24 hours. We retained the most recent row in each set, even though it sometimes meant we didn’t have quite as much information on the arrest.
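One way to picture this rule is the sketch below. It is an assumption-laden illustration rather than BLN's procedure: the field names `person_id` and `arrest_time` are hypothetical, the 24-hour window is treated as inclusive, and each row is compared with the most recent row already kept, so a chain of closely spaced arrests collapses into one.

```python
from datetime import datetime, timedelta


def drop_apparent_duplicates(rows, window=timedelta(hours=24)):
    """Keep the most recent row among same-person arrests within `window` of each other."""
    ordered = sorted(rows, key=lambda r: (r["person_id"], r["arrest_time"]))
    kept = []
    for row in ordered:
        if kept:
            prev = kept[-1]
            same_person = prev["person_id"] == row["person_id"]
            if same_person and row["arrest_time"] - prev["arrest_time"] <= window:
                kept[-1] = row  # the later row replaces the earlier one
                continue
        kept.append(row)
    return kept


# Invented rows: 1 and 2 are exact duplicates, 3 is 12 hours later, 4 is a month later.
rows = [
    {"row": 1, "person_id": "p1", "arrest_time": datetime(2025, 1, 1, 8, 0)},
    {"row": 2, "person_id": "p1", "arrest_time": datetime(2025, 1, 1, 8, 0)},
    {"row": 3, "person_id": "p1", "arrest_time": datetime(2025, 1, 1, 20, 0)},
    {"row": 4, "person_id": "p1", "arrest_time": datetime(2025, 2, 1, 9, 0)},
    {"row": 5, "person_id": "p2", "arrest_time": datetime(2025, 1, 1, 8, 0)},
]
kept_rows = [r["row"] for r in drop_apparent_duplicates(rows)]
```

Exact duplicates and same-timestamp matches are zero hours apart, so the single time-window test covers all three kinds of match in this simplified version.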

BLN state of arrest

About 60,000 of the original 377,000 rows in the FOIA data lacked the state in which the arrest occurred. BLN has made every effort to find the best possible guess for the correct state.

We felt it crucial to fill in this missing data to avoid overestimating the increase in arrests in a given area. From July 2025 forward, the omission is extremely rare; it’s much more common before 2025. This means that if you used the original state as a first step toward filtering the data to your area, you would risk seriously overstating the increase in arrests in late 2025, because the earlier rows with a missing state would simply drop out of your counts.

We were able to estimate a state for almost all of the missing information by taking the first of the following that produced a state:

  1. Use the assigned state from the original data.

  2. If the Area of Responsibility was entirely within one state, use that state. This filled in a lot of the rows from Texas, California and New York City.

  3. Match a detention to an arrest based on the person ID and the date, and use the state of the first detention location. There is some error in areas that cross state lines, such as Kansas City or Washington, D.C., where the first detention location is usually across a state border.

  4. Match a detainer to an arrest based on the person ID and the date. A detainer can be anytime before an arrest and this data only goes back to 2023, so we used the most recent detainer state before the arrest.

  5. The landmark site is often unclear, such as in Atlanta, where they seem to code anything in that area of responsibility as an Atlanta landmark. But elsewhere it’s a lot clearer, such as general metro areas around Minneapolis or Dallas. We counted the number of arrests for each landmark by state in the original data. If one state accounted for at least 90 percent of the arrests at that landmark, we used that state. Otherwise, we left it blank.

This left fewer than 1,000 rows without an assigned state.
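The fallback order above amounts to taking the first non-missing candidate, with the landmark rule supplying the last one. The sketch below illustrates that idea under stated assumptions: the field names, the `single_state_aors` mapping, and the placeholder landmark and AOR labels are all hypothetical. Only the order of the steps and the 90 percent threshold come from the text.

```python
from collections import Counter, defaultdict


def landmark_lookup(rows, threshold=0.90):
    """Map a landmark to a state when one state has at least `threshold` of its arrests."""
    counts = defaultdict(Counter)
    for row in rows:
        if row.get("landmark") and row.get("original_state"):
            counts[row["landmark"]][row["original_state"]] += 1
    lookup = {}
    for landmark, by_state in counts.items():
        state, n = by_state.most_common(1)[0]
        if n / sum(by_state.values()) >= threshold:
            lookup[landmark] = state
    return lookup


def best_state(row, single_state_aors, landmarks):
    """Return the first available state, trying the five sources in order."""
    candidates = [
        row.get("original_state"),                 # 1. state in the original data
        single_state_aors.get(row.get("aor")),     # 2. AOR that sits inside one state
        row.get("first_detention_state"),          # 3. first matched detention
        row.get("latest_detainer_state"),          # 4. most recent earlier detainer
        landmarks.get(row.get("landmark")),        # 5. landmark dominated by one state
    ]
    return next((s for s in candidates if s), None)


# Invented data: "LM A" is 95% one state, "LM B" is split evenly.
known = (
    [{"landmark": "LM A", "original_state": "MN"}] * 19
    + [{"landmark": "LM A", "original_state": "WI"}]
    + [{"landmark": "LM B", "original_state": "KS"}, {"landmark": "LM B", "original_state": "MO"}]
)
landmarks = landmark_lookup(known)
single_state_aors = {"AOR X": "TX"}  # placeholder, not a real AOR name
```

A row that fails all five steps gets `None`, which corresponds to the rows left without an assigned state.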

BLN crime codes

The original FOIA data only contained an indicator for whether the person 1) had a criminal conviction, 2) faced criminal charges without a conviction, or 3) was an “other immigration violator.”[2] Importantly, there was no detail on what crime was associated with the conviction.

In other tables, such as the detentions, ICE assigns what it calls the “Most serious criminal conviction” for each person. Previous reporting on this coding and a review of forms used by ICE suggest that this conviction is the one in a person’s history that carried the longest sentence. Generally, violent crimes, drugs and weapon charges are considered more serious than traffic, which in turn can be more serious than entering the country illegally.

We matched the detention and detainer data to the arrests in the process outlined above, and chose the most serious criminal conviction. We used the most recent if there was more than one, under the idea that we wanted to overestimate more serious crimes rather than underestimate them.

Each row in the BLN data has a code and description for detailed and more general crimes. We only assigned a crime code to those that were recorded in the original data as “1 Convicted Criminal”; the other two categories were assigned a code that would indicate there was no criminal conviction in the arrest record. Similarly, if the arrest record indicated a conviction but we couldn’t find one, we assigned it a code to indicate it was unknown.
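The assignment logic in this paragraph can be sketched as a small function. The two placeholder codes and the example conviction code are invented for illustration, since the real values are listed in the “Common data elements” chapter; only the “1 Convicted Criminal” label comes from the text.

```python
# Hypothetical placeholder codes, not the actual BLN values.
NO_CONVICTION = "NO_CONVICTION"
UNKNOWN = "UNKNOWN"


def assign_charge_code(apprehension_criminality, matched_conviction_code):
    """Pick a charge code from the arrest's criminality flag and any matched conviction."""
    if apprehension_criminality != "1 Convicted Criminal":
        # Pending charges or other immigration violator: no conviction on the arrest record.
        return NO_CONVICTION
    # Conviction flagged: use the matched code, or mark it unknown if none was found.
    return matched_conviction_code or UNKNOWN


convicted_with_match = assign_charge_code("1 Convicted Criminal", "X123")  # "X123" is made up
convicted_no_match = assign_charge_code("1 Convicted Criminal", None)
not_convicted = assign_charge_code("some other category", "X123")
```

Note that a matched conviction is ignored when the arrest record itself does not indicate a conviction, which mirrors the rule described above.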

More detail on the meaning of the individual codes is shown in the “Common data elements” chapter.

Simplified categories

At the risk of losing some detail, BLN has simplified some of the categories used in the original data to make for more straightforward analysis.

Apprehension method

The apprehension method refers to the circumstances under which the person has been arrested. We have used ICE’s own categorization suggested by the 2024 ICE annual report, page 15:

  • At-large arrests, which take place within the community (i.e., outside of jails or prisons).
  • Custodial arrests, which take place in the confines of a jail or prison, as CAP works with federal, state and local law enforcement partners to take custody of removable noncitizens who have already been arrested by another law enforcement agency for criminal activity.

The following 7 categories in the original data were combined to create the “Custodial” grouping: 287(g) Program, CAP Federal Incarceration, CAP Local Incarceration, CAP State Incarceration, Criminal Alien Program, Custodial Arrest, Other Agency (turned over to INS). The remaining 20 categories were combined to create the “At-large” arrests. (Most of these weren’t very descriptive, such as “non-custodial arrest” or “ERO Reprocessed Arrest”.)
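The recoding described above is a simple set lookup. In the sketch below, the seven category names are taken from the text, but their exact spelling and capitalization in the raw data is an assumption, as is the function name `recode_method`.

```python
# The seven original categories that make up the "Custodial" grouping.
CUSTODIAL_METHODS = {
    "287(g) Program",
    "CAP Federal Incarceration",
    "CAP Local Incarceration",
    "CAP State Incarceration",
    "Criminal Alien Program",
    "Custodial Arrest",
    "Other Agency (turned over to INS)",
}


def recode_method(original_method):
    """Collapse an original apprehension method into Custodial or At-large."""
    cleaned = (original_method or "").strip()
    # Anything outside the seven custodial categories is treated as At-large.
    return "Custodial" if cleaned in CUSTODIAL_METHODS else "At-large"


print(recode_method("CAP Local Incarceration"))  # Custodial
print(recode_method("Non-Custodial Arrest"))     # At-large
```

Treating everything else as At-large means a misspelled custodial category would be silently misclassified, so it is worth tabulating the original values before recoding.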

This breakout doesn’t exactly match what some news organizations have done, which is to distinguish “street” arrests from other at-large arrests. A street arrest might not include, for example, worksite raids. We decided to simplify the categories, so in this data you can use the terms “street arrest” and “at-large arrest” interchangeably.

Detention details

About 85 percent of the arrests could be matched to a detention. We used the Deportation Data Project’s criteria to match a detention to an arrest: The detention must begin within 10 days after the arrest, or less than 5 days before it. This caught some of the delays we saw in the arrest dates. The table includes the number of days the arrest was recorded before the detention, with negative numbers indicating the arrest was recorded after the detention.
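The matching window can be sketched as a date comparison. This is an illustration under assumptions, not the Deportation Data Project's code: the 10-day bound is treated as inclusive, the 5-day bound as exclusive, and the earliest detention inside the window is taken as the match.

```python
from datetime import date


def match_detention(arrest_date, detention_dates):
    """Return (detention_date, days_after) for the first detention in the window, else None."""
    in_window = [
        d for d in detention_dates
        if -5 < (d - arrest_date).days <= 10  # less than 5 days before, up to 10 days after
    ]
    if not in_window:
        return None
    first = min(in_window)
    # Negative values mean the detention is recorded as starting before the arrest.
    return first, (first - arrest_date).days


arrest = date(2025, 5, 10)
detentions = [date(2025, 5, 1), date(2025, 5, 8), date(2025, 5, 12), date(2025, 6, 1)]
matched = match_detention(arrest, detentions)
```

Here the detention on May 8 is the match, two days before the recorded arrest, so the days-after value is -2.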

Detainer details

Fewer than half of the detainers could be matched to an arrest record. One reason is that detainers are much less likely to have a person identifier. We used the most recent detainer recorded for a person as a match, even if it was issued several years earlier. The reason is that detainers don’t expire, and the person could have been incarcerated locally or at the state level the entire time. We did not include as a match any detainer after the arrest.
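A minimal sketch of this rule: among a person's detainers, take the latest one that does not come after the arrest. The function name is hypothetical, and counting a same-day detainer as a match is an assumption.

```python
from datetime import date


def match_detainer(arrest_date, detainer_dates):
    """Return (detainer_date, days_before) for the most recent detainer up to the arrest."""
    earlier = [d for d in detainer_dates if d <= arrest_date]  # drop detainers after the arrest
    if not earlier:
        return None
    latest = max(earlier)
    return latest, (arrest_date - latest).days


arrest = date(2025, 5, 10)
detainers = [date(2023, 11, 1), date(2025, 4, 30), date(2025, 5, 15)]
matched_detainer = match_detainer(arrest, detainers)
```

In this example the detainer from April 30 is chosen, 10 days before the arrest. The one dated May 15 is ignored because it comes after the arrest.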


  1. There is a very small chance that shortening the hash could result in improper duplicates. The data has been checked to make sure that never happened across all of the tables in the collection.

  2. ICE contends that everyone they arrest is an immigration violator.