INDEPENDENT ASSESSMENT CONFIRMS HIGH ACCURACY

At Gender-API, we take the accuracy of our service seriously. Recently, an independent assessment was conducted to validate the reliability of our service in determining gender based on first names. The findings were encouraging, showcasing our API’s precision and effectiveness in handling diverse names across multiple countries.
Study Overview
A validation study was undertaken by Jim Hagberg of the University of Maryland to evaluate the error rate of our gender-identification service against manually verified online sources. The study analyzed first names extracted from research papers published in three scientific journals:
- Journal of Applied Physiology
- Medicine and Science in Sports and Exercise
- International Journal of Sports Medicine
The analysis included 500 first names that were not clearly gender-specific and were unknown to the researcher. Each name was validated through online searches for images of the author or for gender-specific pronouns used to refer to them.
Methods of Gender Identification
The study used three independent methods to identify gender:
- Traditional first-name recognition, based on widely accepted gender-specific names.
- The researcher’s personal knowledge of the individual’s gender.
- Gender-API, which uses AI and a database of over 6 million names across 190 countries to predict gender.
Key Findings
- Out of 500 names, 11 (2.2%) had no results in the Gender-API database.
- Of the remaining 488 names, 435 (89.1%) were correctly identified with at least 80% confidence.
- 392 names (80.3%) were correctly identified with over 90% confidence.
- 359 names (73.5%) were correctly identified with over 95% confidence.
- 282 names (57.8%) were correctly identified with over 98% confidence.
The average confidence level across all predictions was 94% ± 13%. This demonstrates a high degree of reliability, even for less common names.
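The tiered figures above can be reproduced for any validated sample by counting the predictions that are both correct and at or above a given confidence level. The following is a minimal sketch under stated assumptions: the `Prediction` record, its field names, and the sample data are hypothetical and invented for illustration; they are not taken from the study or from the Gender-API response format.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    name: str
    predicted: str     # gender returned by the service (hypothetical field)
    confidence: float  # confidence in percent, 0 to 100 (hypothetical field)
    verified: str      # gender found by manual online validation


# Hypothetical records, invented for illustration only (not study data).
SAMPLE = [
    Prediction("Name A", "female", 99.0, "female"),
    Prediction("Name B", "male", 96.0, "male"),
    Prediction("Name C", "male", 91.0, "male"),
    Prediction("Name D", "female", 85.0, "male"),  # confident but wrong
    Prediction("Name E", "male", 62.0, "male"),    # correct but low confidence
]


def share_correct_at(predictions, threshold):
    """Percent of all predictions that are correct with confidence >= threshold."""
    if not predictions:
        return 0.0
    hits = sum(
        1
        for p in predictions
        if p.predicted == p.verified and p.confidence >= threshold
    )
    return 100.0 * hits / len(predictions)


def tier_table(predictions, thresholds=(80, 90, 95, 98)):
    """Map each confidence threshold to the share of correct predictions."""
    return {t: share_correct_at(predictions, t) for t in thresholds}


if __name__ == "__main__":
    for threshold, share in tier_table(SAMPLE).items():
        print(f"confidence >= {threshold}%: {share:.1f}% correctly identified")
```

The denominator is every name that returned a result, not only the names above the threshold, which matches how the percentages in the list above are expressed.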
Error Rate Analysis
A total of 22 names (4.5% of the 488 names with results) showed discrepancies between the Gender-API prediction and the online validation. However, when only predictions with at least 80% confidence were considered, misclassifications dropped to just seven names (1.4%).
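As a quick check of the arithmetic, the two error rates can be recomputed from the reported counts. This sketch assumes the percentages are taken against the 488 names that returned a result, which is the denominator consistent with the 4.5% figure; the helper name `percent` is illustrative.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


NAMES_WITH_RESULT = 488   # names for which Gender-API returned a prediction
DISCREPANCIES_ALL = 22    # mismatches at any confidence level
DISCREPANCIES_AT_80 = 7   # mismatches among predictions with >= 80% confidence

if __name__ == "__main__":
    print(percent(DISCREPANCIES_ALL, NAMES_WITH_RESULT))    # 4.5
    print(percent(DISCREPANCIES_AT_80, NAMES_WITH_RESULT))  # 1.4
```

Measured against all 500 names instead, the first figure would come out as 4.4%, so the reported 4.5% only matches the 488-name denominator.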
Conclusion
This independent validation study confirms that Gender-API is a highly reliable tool for gender identification. With a low misclassification rate of just 1.4% when using an 80% confidence threshold, our API provides accurate and scalable gender classification based on first names.
Download the validation report here as PDF.
Gender-API is a trustworthy choice for researchers, businesses, and analysts looking for a proven, data-driven gender identification solution.