Rethinking USPTO Applicant Diversity

By Saurabh Vishnubhakat
January 31, 2021

“Demographic data is less worth collecting at all unless it can be collected in a way that is statistically representative and useful for analysis and policy making.”

The Day One Project recently released over 100 proposals for the Biden-Harris administration to use as roadmaps in crafting science and technology policy. One of those proposals, a Transition Document for the United States Patent and Trademark Office (USPTO), recommends an important and specific step forward for the growing policy agenda on diversity in U.S. innovation.

The USPTO should undertake a pilot program for mandatory collection of demographic data from patent and trademark applicants. This recommendation is a conscious break from past public commentary, which has often urged data collection on a purely voluntary basis.

Studying USPTO applicant diversity is a genuine policy challenge with competing constraints. Although the favorable political consensus to address it is relatively recent, its origins go back a decade to the 2011 America Invents Act (AIA). On balance, the benefits of mandatory data collection are substantial, and privacy concerns can be resolved without squandering those benefits by settling for voluntary data collection. The discussion here focuses on patent data, but the arguments also translate readily to the trademark system.

Growing Interest in Diversity in Innovation

There is now significant interest in systematic research on demographic diversity across the U.S. innovation system. The USPTO-driven National Council for Expanding American Innovation is both a reflection of this interest and a focal point for advancing it, especially after the agency’s public outreach under the SUCCESS Act. In addition to the USPTO’s final report to Congress in October 2019, the Chief Economist of the USPTO has also published the companion paper Progress and Potential, first in February 2019 and updated in July 2020 with an empirical profile of women’s participation in U.S. patenting.

Meanwhile, Senators Mazie Hirono (D-HI), Thom Tillis (R-NC), and Chris Coons (D-DE) recently asked the USPTO for details about gender disparity in the patent bar as reported in a 2020 scholarly article by Mary Hannon as well as prior studies from 2011 and from 2014. The USPTO’s response specifically suggested the possibility of further empirical study.

Finally, beyond innovators and their representatives in the bar, empirical research on diversity has also extended to the USPTO’s own examiner corps. For example, a 2019 working paper by Deepak Hegde and Manav Raj has shown statistically significant differences between male and female patent examiners as to diligence, average work quality, and quantity of output—as well as in their respective likelihoods of promotion and work preferences. In each of these contexts, diversity is of growing policy interest, and gender diversity is especially salient.

The Problem of Reliable, Replicable Data

What these and various other studies share is a common set of difficulties in ensuring that the underlying data is reliable, replicable, and capable of supporting statistically sound inferences. For example, the 2014 study on gender diversity in the patent bar began with the entire public register of attorneys and agents admitted to the USPTO, but gender was only estimated from first names, not confirmed. Moreover, although the data dictionary used for inferring gender was a publicly available report from the Census Bureau, that report was published in 1995—based on data that is now four decennial census periods old.
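To make the limitation concrete, here is a minimal sketch of name-based gender estimation. It is not the method of any study cited here: the dictionary values, the `infer_gender` helper, and the 0.9 threshold are all hypothetical. The point is only that a lookup of this kind yields estimates, and that ambiguous or missing names drop out of the sample.

```python
# Hypothetical name dictionary: share of people with each first name
# recorded as female. Figures are invented for illustration and are
# not taken from any Census Bureau report.
FEMALE_SHARE = {"mary": 0.99, "james": 0.01, "jordan": 0.45, "leslie": 0.80}


def infer_gender(first_name, threshold=0.9):
    """Return 'F', 'M', or None when the name is unknown or too ambiguous."""
    share = FEMALE_SHARE.get(first_name.strip().lower())
    if share is None:
        return None  # name not in the dictionary
    if share >= threshold:
        return "F"
    if share <= 1 - threshold:
        return "M"
    return None  # name is used too evenly by both groups to classify


names = ["Mary", "James", "Jordan", "Leslie", "Saanvi"]
labels = {name: infer_gender(name) for name in names}

# Fraction of names that received any label at all.
coverage = sum(label is not None for label in labels.values()) / len(names)
print(labels, coverage)
```

In this toy sample, only two of the five names can be labeled. An older dictionary would tend to shrink that coverage further, since names that became common after its source data was gathered would be missing from it.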

The 2019 patent examiner study by Hegde and Raj improved on this considerably by relying on a professional vendor with a database of over 1.2 million unique personal and family names. This makes for higher-quality estimation, but the vendor’s proprietary algorithm creates a barrier to replicability, which is especially important for policy making.

Even the USPTO’s response to Senators Hirono, Tillis, and Coons was candid about the limited quality of the agency’s statistics, noting that they “do not show the complete gender data” but do provide “some insight into the possible gender breakdown” of the patent bar. The analysis was based on the use of honorific prefixes among patent bar applicants, with “Mr.” and “Ms.” corresponding to men and women, respectively. One problem is that the relevant USPTO form offers not two choices but five: “Mr.,” “Ms.,” “Mrs.,” “Miss,” and “Dr.” In particular, the final option “Dr.” is not likely to be evenly distributed between men and women across different technology areas. Moreover, all of this information is optionally self-reported, creating further potential for non-random bias in the data.

The Call for Mandatory Data Collection

The USPTO is well aware of these data reliability issues. Back in March 2012, the agency published a methodology for studying the diversity of applicants as required by Section 29 of the AIA. That methodology included matching public patent data with confidential Census Bureau data to determine demographic attributes including race, gender, veteran status, age, economic status, education, geography, and much else.

However, as later reported in a June 2015 memorandum, even this data-matching effort was “only partially successful.” The relatively basic information in USPTO data (name, town, and state) was not enough to disambiguate inventors, especially those with common names in large population centers. Even with full and detailed demographic data at the Census Bureau, the USPTO’s own sparse data collection allowed only a modest-quality match: only 64.3% of U.S.-resident inventors listed in USPTO patent data could be matched to Census Bureau data.
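The disambiguation problem can be illustrated with a toy record match. This sketch is not the USPTO's or the Census Bureau's actual procedure, and every record in it is invented. It assumes a simple exact-match rule on name, town, and state, and treats an inventor as matched only when exactly one candidate is found.

```python
from collections import defaultdict

# Invented records for illustration only.
patent_inventors = [
    ("john smith", "houston", "tx"),
    ("maria garcia", "los angeles", "ca"),
    ("wei chen", "durham", "nc"),
]
census_people = [
    {"id": 1, "key": ("john smith", "houston", "tx")},
    {"id": 2, "key": ("john smith", "houston", "tx")},  # same name, same town
    {"id": 3, "key": ("wei chen", "durham", "nc")},
]

# Index the confidential-side records by (name, town, state).
index = defaultdict(list)
for person in census_people:
    index[person["key"]].append(person["id"])


def match(inventor):
    """Return a record id only when exactly one candidate shares the key."""
    candidates = index.get(inventor, [])
    return candidates[0] if len(candidates) == 1 else None


matches = {inv: match(inv) for inv in patent_inventors}
match_rate = sum(m is not None for m in matches.values()) / len(patent_inventors)
print(matches, round(match_rate, 3))
```

Here the common name in a large city fails because two candidates share the same key, and a second inventor fails because no candidate exists at all. Only one of three inventors is matched. Richer identifying fields on the patent side would be needed to separate the two same-named candidates.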

That memorandum highlighted the likely risk of statistical bias in voluntarily self-reported data. It recognized strong support among public commentators, especially the influential view of the American Intellectual Property Law Association (AIPLA), for voluntary data collection—but also noted that none of the respondents who voiced concern about respondent privacy addressed the problem of data quality.

Importantly, the memorandum concluded that “for the USPTO to study patent applicant diversity further, there must first be a resolution to the tension under current law between the statistical rigor of mandatory surveys and the public support and existing authority for voluntary surveys.”

Statistical Rigor amid Privacy Concerns

The tension that the USPTO identified may be summed up like this. Demographic data is less worth collecting at all unless it can be collected in a way that is statistically representative and useful for analysis and policy making. Thus, the problem with voluntary approaches is that they would almost surely suffer from selection effects, including self-selection among respondents, whose magnitudes and directions would be difficult to estimate or correct. The risks of selection and other biases in the data are especially problematic given the already low-quality match (64.3%) of USPTO data with Census Bureau data.
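A short simulation shows why self-selection matters. It is a sketch under stated assumptions, not an estimate of any real population: the true share of 20% and the two response rates are hypothetical numbers chosen only to make the effect visible.

```python
import random


def simulate(n=100_000, true_share=0.20,
             p_respond_women=0.60, p_respond_men=0.30, seed=0):
    """Compare a full (mandatory) count with a self-selected (voluntary) one.

    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    # True marks a woman applicant, False a man.
    population = [rng.random() < true_share for _ in range(n)]
    # Voluntary survey: each person chooses whether to respond, and the
    # probability of responding differs between the two groups.
    respondents = [
        is_woman for is_woman in population
        if rng.random() < (p_respond_women if is_woman else p_respond_men)
    ]
    mandatory_estimate = sum(population) / len(population)
    voluntary_estimate = sum(respondents) / len(respondents)
    return mandatory_estimate, voluntary_estimate


mandatory, voluntary = simulate()
print(f"mandatory: {mandatory:.3f}  voluntary: {voluntary:.3f}")
```

Under these assumed response rates, the voluntary estimate lands near one third, even though the true share is one fifth. An analyst who sees only the voluntary responses has no way to recover the response rates, so neither the size nor the direction of the error can be corrected after the fact.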

Meanwhile, demographic data cannot be collected in violation of the Paperwork Reduction Act, the Privacy Act, and the Census Bureau’s own confidentiality obligations. Indeed, though the USPTO’s own patent data is publicly available, once matched with Census information, even the resulting data is prohibited from release because it includes commingled confidential data. Thus, any effort at mandatory data collection must not run afoul of these legal constraints.

Privacy Safeguards and Authorizing Legislation

Between these competing constraints, the law is more moveable than the demands of statistical rigor. Accordingly, the USPTO’s focus for collecting demographic data should be twofold.

First, the agency should protect the data from unauthorized disclosure outside the agency and from undue influence inside the agency on patent examination or other processes. In particular, the USPTO should not allow the availability of demographic information about patent applicants to enable bias, whether conscious or unconscious, on the part of patent examiners.

Second, the agency should ensure that its legal authority to collect demographic information is clear and specific. The USPTO’s response to Senators Hirono, Tillis, and Coons was quite sensible in this regard, expressly connecting the collection of more comprehensive data to a need for relevant authorizing legislation—and offering technical assistance on such legislation.

The Flip Side of Privacy Concerns

Finally, there is another, largely unexamined, dimension to the concerns for USPTO applicant privacy. Recent research cautions that unconscious bias may already exist in patent examination to varying degrees across technology centers and art units. A 2012 report of the National Women’s Business Council [Part I; Part II] shows nearly identical trends for the patents filed versus patents granted for both women and men.

However, despite this trend, a peer-reviewed 2018 paper by Kyle Jensen and coauthors showed that patent examiners tend to favor male inventors and judge applications with a female name more harshly: applicants with common names from which female inventors can easily be identified were 8.2% less likely to be granted a patent, whereas those with uncommon names that are harder to guess were only 2.8% less likely. This suggests that there may already be some unconscious bias at work. If so, such bias is likely to be rooted in demographic inferences that may be drawn from inventor information that is already available.

It does not necessarily follow that collecting more information and better information about applicants will cause even greater gender bias or disparity. To the contrary, by restricting access to any new demographic information that the USPTO collects—especially keeping it firewalled from the examination process—the agency could minimize the day-to-day effect of that information on examiner operations. The USPTO’s more systematic and complete collection of demographic information may even aid in identifying and mitigating existing bias.

Conclusion: A USPTO Pilot Program

This, in sum, is the analytical and historical case for the pilot program proposed in the Day One Project Transition Document. Indeed, as law professor and former Obama administration adviser Colleen Chien has aptly observed, the USPTO is “an innovation agency that generates its own fees” and thus “has a less politicized mandate as well as a strong culture of piloting.” A soundly constructed, statistically informative pilot program would do much to guide the USPTO’s decision making in the difficult balance between gathering useful data and respecting privacy values.

Image Source: Deposit Photos
Author: tashatuvango
Image ID: 31248541

The Author

Saurabh Vishnubhakat

Saurabh Vishnubhakat is a Professor of Law and Professor of Engineering at Texas A&M University, and a Research Fellow at the Duke Law Center for Innovation Policy. He writes and teaches on intellectual property, administrative law, and federal litigation, especially from an empirical perspective. Professor Vishnubhakat’s research has been cited in federal appellate and trial court opinions, agency reports and rulemaking, and over two dozen Supreme Court briefs. Until 2015, Professor Vishnubhakat served in the United States Patent and Trademark Office as principal legal advisor to that agency’s first two chief economists. He was also a faculty fellow at the Duke Law School and a postdoctoral associate at the Duke Center for Public Genomics.

For more information or to contact Professor Vishnubhakat please visit his personal website.

Warning & Disclaimer: The pages, articles and comments on IPWatchdog.com do not constitute legal advice, nor do they create any attorney-client relationship. The articles published express the personal opinion and views of the author as of the time of publication and should not be attributed to the author’s employer, clients or the sponsors of IPWatchdog.com.

Discuss this

There are currently 10 comments.

  1. Anon January 31, 2021 2:14 pm

    Without being too flippant, why Binary?

    What to do about trans (both directions?)

  2. Pro Say January 31, 2021 4:28 pm

    Sorry — but no.

    Anyone’s desire for demographic data (including any gov agency’s, and especially via mandatory disclosure) does not, and indeed must never, override Americans’ right to privacy.

    Never.

    Regardless of any alleged “policy reasons.”

    “Conclusion: A USPTO Pilot Program”

    This is one program that should remain . . . pilotless.

  3. Anon January 31, 2021 7:24 pm

    Pro Say,

    Sorry but no. What you are viewing as a matter of privacy has never been such a right (i.e., census data has always been permitted to require the type of data that you are considering to be off limits).

  4. AnotherAnon February 1, 2021 7:10 am

    Do you know which other professions have gender disparities? Nursing. Teaching. Why isn’t your local nursing board loosening the requirements to become a nurse in order to attract more men? The truth is that biological differences in men and women lead to interests in different types of careers.

    If you want to increase diversity at the patent bar level, the solution is to increase diversity in the high school and college STEM class level. Anyone who was a STEM major in college knows that the classes contain a high percentage of men. Don’t change the rules of the game in the 4th quarter just because the outcome isn’t what you had hoped for. Besides, in 2021 one can simply change one’s pronouns and become a different gender. What does this have to do with patents again?

  5. Pro Say February 1, 2021 11:02 am

    Anon, enshrined in law or not, Americans’ right to privacy trumps all rights (and no; I’m not a supporter of that poor excuse for a president).

    As we have seen time and time again, our data — including that obtained by gov agencies — is regularly used for purposes other than the purpose for which the data was allegedly originally obtained.

    To say nothing of all the hacking of gov databases over the years. Hacking which stole millions of Americans’ personal — and in some cases very personal — information.

    Indeed, such mandated personal data collection would result in suppressing patent and trademark applications.

    Especially by those most (and rightly) concerned about — yet once again — being forced to reveal personal information to yet another gov agency.

    Such personal information must never be required in order to obtain IP protection for innovations.

    Never.

  6. Anon February 1, 2021 12:28 pm

    Pro Say,

    I hear what you want and understand WHY you want it.

    But legally, you are barking about a non-existent legal right.

    You would actually have to have a law put in place first, then you could try to fit this situation into that law.

    Any such attempt though would run directly into the opposite precedent that I mentioned (in regards to Census data).

  7. Anon February 1, 2021 2:20 pm

    AnotherAnon,

    A bit tongue in cheek, but do you know another profession that has severe gender disparities?

    Motherhood.

    There has to be some legislation against that….

    😉

  8. rm February 1, 2021 6:40 pm

    Why not just anonymize the information that examiners get? Double patenting checks can be de-identified using identity tokens.

    As an examiner, I don’t even read the names of the inventors and I doubt seriously anyone else does.

  9. AnonAnonAnon February 2, 2021 4:10 pm

    How will the policy operate for inventors outside the US? What if the racial/ethnic categories in the home countries don’t match up with the way the U.S. government categorizes them?

    For example, my understanding is that the government of the UK doesn’t officially recognize “Hispanic/Latino” as a separate ethnicity. How should a UK citizen of Mexican ancestry categorize him or herself, by the UK standard or the US one?

  10. ipguy February 2, 2021 9:44 pm

    @8

    “I don’t even read the names of the inventors and I doubt seriously anyone else does”

    When I was an Examiner, I read the names of the inventors because it was a great way to find relevant prior art by the same inventor(s). There’s nothing more vicious than an Examiner citing an inventor’s own prior art against them (when applicable).