The Entity Detection model lets you automatically identify and categorize key information in transcribed audio content.
To enable it, set `entity_detection` to `true` in the transcription config.

Key | Type | Description |
---|---|---|
entity_detection | boolean | Enable Entity Detection. |
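
As a rough illustration of the parameter above, the sketch below builds a transcription config with Entity Detection switched on and prints it as JSON. Only the `entity_detection` key comes from this page; the `audio_url` field, the example URL, and the helper name `build_transcription_config` are assumptions made for the example. No request is sent.

```python
import json
from typing import Any, Dict


def build_transcription_config(audio_url: str) -> Dict[str, Any]:
    """Return a transcription config with Entity Detection enabled.

    `audio_url` is an assumed field name for the audio source; only
    `entity_detection` is taken from the parameter table.
    """
    return {
        "audio_url": audio_url,    # assumption: how the audio file is referenced
        "entity_detection": True,  # documented key: enables Entity Detection
    }


config = build_transcription_config("https://example.com/audio.mp3")

# Serializing with json turns Python's True into the JSON literal true.
print(json.dumps(config, indent=2))
```

The resulting JSON would be sent as the body of the transcription request by whatever HTTP client or SDK you already use.
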

The response includes the following keys:

Key | Type | Description |
---|---|---|
entities | array | An array of detected entities. |
entities[i].entity_type | string | The type of entity for the i-th detected entity. |
entities[i].text | string | The text for the i-th detected entity. |
entities[i].start | number | The starting time, in milliseconds, at which the i-th detected entity appears in the audio file. |
entities[i].end | number | The ending time, in milliseconds, for the i-th detected entity in the audio file. |
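
To make the response shape above concrete, here is a minimal sketch that walks the `entities` array and converts the millisecond timestamps to seconds. The `sample_response` dictionary is hand-written for illustration and is not real API output, and `describe_entities` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Hand-written sample that mirrors the documented keys; not real API output.
sample_response: Dict[str, Any] = {
    "entities": [
        {"entity_type": "person_name", "text": "Doug Jones", "start": 1200, "end": 1900},
        {"entity_type": "organization", "text": "CNN", "start": 4300, "end": 4750},
    ]
}


def describe_entities(response: Dict[str, Any]) -> List[str]:
    """Return one readable line per detected entity.

    `start` and `end` are in milliseconds, so they are divided by 1000
    to report seconds.
    """
    lines = []
    # Treat a missing or null `entities` key as "nothing detected".
    for entity in response.get("entities") or []:
        start_s = entity["start"] / 1000
        end_s = entity["end"] / 1000
        lines.append(
            f"{entity['entity_type']}: {entity['text']} "
            f"({start_s:.2f}s to {end_s:.2f}s)"
        )
    return lines


for line in describe_entities(sample_response):
    print(line)
```
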

The model can detect the following entity types:

Entity name | Description | Example |
---|---|---|
account_number | Customer account or membership identification number | Policy No. 10042992; Member ID: HZ-5235-001 |
banking_information | Banking information, including account and routing numbers | |
blood_type | Blood type | O-, AB positive |
credit_card_cvv | Credit card verification code | CVV: 080 |
credit_card_expiration | Expiration date of a credit card | |
credit_card_number | Credit card number | |
date | Specific calendar date | December 18 |
date_interval | Broader time periods, including date ranges, months, seasons, years, and decades | 2020-2021; 5-9 May; January 1984 |
date_of_birth | Date of birth | Date of Birth: March 7, 1961 |
drivers_license | Driver’s license number | DL# 356933-540 |
drug | Medications, vitamins, or supplements | Advil, Acetaminophen, Panadol |
duration | Periods of time, specified as a number and a unit of time | 8 months; 2 years |
email_address | Email address | support@assemblyai.com |
event | Name of an event or holiday | Olympics, Yom Kippur |
filename | Names of computer files, including the extension or filepath | Taxes/2012/brad-tax-returns.pdf |
gender_sexuality | Terms indicating gender identity or sexual orientation, including slang terms | female; bisexual; trans |
healthcare_number | Healthcare numbers and health plan beneficiary numbers | Policy No.: 5584-486-674-YM |
injury | Bodily injury | I broke my arm, I have a sprained wrist |
ip_address | Internet IP address, including IPv4 and IPv6 formats | 192.168.0.1 |
language | Name of a natural language | Spanish, French |
location | Any location reference, including mailing address, postal code, city, state, province, country, or coordinates | Lake Victoria, 145 Windsor St., 90210 |
marital_status | Terms indicating marital status | Single; common-law; ex-wife; married |
medical_condition | Name of a medical condition, disease, syndrome, deficit, or disorder | chronic fatigue syndrome, arrhythmia, depression |
medical_process | Medical process, including treatments, procedures, and tests | heart surgery, CT scan |
money_amount | Name and/or amount of currency | 15 pesos, $94.50 |
nationality | Terms indicating nationality, ethnicity, or race | American, Asian, Caucasian |
number_sequence | Numerical PII (including alphanumeric strings) that doesn’t fall under other categories | |
occupation | Job title or profession | professor, actors, engineer, CPA |
organization | Name of an organization | CNN, McDonalds, University of Alaska, Northwest General Hospital |
passport_number | Passport numbers, issued by any country | PA4568332; NU3C6L86S12 |
password | Account passwords, PINs, access keys, or verification answers | 27%alfalfa, temp1234, My mother’s maiden name is Smith |
person_age | Number associated with an age | 27, 75 |
person_name | Name of a person | Bob, Doug Jones, Dr. Kay Martinez, MD |
phone_number | Telephone or fax number | |
physical_attribute | Distinctive bodily attributes, including terms indicating race | I’m 190cm tall; He belongs to the Black students’ association |
political_affiliation | Terms referring to a political party, movement, or ideology | Republican, Liberal |
religion | Terms indicating religious affiliation | Hindu, Catholic |
statistics | Medical statistics | 18%, 18 percent |
time | Expressions indicating clock times | 19:37:28; 10pm EST |
url | Internet addresses | https://www.assemblyai.com/ |
us_social_security_number | Social Security Number or equivalent | |
username | Usernames, login names, or handles | @AssemblyAI |
vehicle_id | Vehicle identification numbers (VINs), vehicle serial numbers, and license plate numbers | 5FNRL38918B111818; BIF7547 |
zodiac_sign | Names of Zodiac signs | Aries; Taurus |
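
Because each detected entity carries one of the `entity_type` values listed above, a common follow-up step is to group or filter results by type. The sketch below shows one possible way to do that. The `detected` list is made up from the example column of the table, and `group_by_entity_type` is a hypothetical helper, not part of any SDK.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional, Set


def group_by_entity_type(
    entities: Iterable[Dict[str, Any]],
    wanted_types: Optional[Set[str]] = None,
) -> Dict[str, List[str]]:
    """Map each entity type to the texts detected for it.

    If `wanted_types` is given, entities of any other type are skipped.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        entity_type = entity["entity_type"]
        if wanted_types is None or entity_type in wanted_types:
            grouped[entity_type].append(entity["text"])
    return dict(grouped)


# Illustrative input built from the example column of the table above.
detected = [
    {"entity_type": "language", "text": "Spanish", "start": 500, "end": 900},
    {"entity_type": "email_address", "text": "support@assemblyai.com", "start": 1500, "end": 3100},
    {"entity_type": "language", "text": "French", "start": 4000, "end": 4400},
]

print(group_by_entity_type(detected))
print(group_by_entity_type(detected, wanted_types={"email_address"}))
```
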
How does the Entity Detection model handle misspellings or variations of entities?
Can the Entity Detection model identify custom entity types?
How can I improve the accuracy of the Entity Detection model?