Update: GQ-Pat now has over 334 million sequences

Back in July we reported that there were 300 million sequences in GQ-Pat, including 256 million nucleotide sequences and over 45 million protein sequences. And these protein sequences aren’t just automated translations of nucleotide sequences, as in TrEMBL. All of these sequences are in fact found in patents and patent applications from patent authorities around the world.

View the latest GQ-Pat Statistics

This year we’ve added 75,000 documents and 27 million sequences, making GQ-Pat even bigger than before.

To put this accomplishment in perspective, when the Human Genome Project formally began in 1990, there were fewer than 40,000 sequences in GenBank, which had yet to be transferred from Stanford to the newly created National Center for Biotechnology Information (NCBI).

As of this month, according to the NCBI’s GenBank Statistics page, there are 193 million nucleotide sequences in the GenBank/EMBL/DDBJ consortium, the world’s gold standard for sequence databases.*


Updated on a weekly basis, the GenomeQuest database, GQ-Pat, is the most comprehensive and up-to-date IP sequence database available.

Not only does GQ-Pat have more sequences than GenBank, but these sequences also help searchers in other ways:

  1. The sheer size of the database helps organizations save money by making searches more efficient and by ensuring that important search results aren’t missed.
  2. Sequences in GQ-Pat are well annotated because all of them have been found in patents, making them more valuable to researchers. Patent information includes descriptions of a particular invention, including the way in which the invention is used, the inventors, the owners, biological information about the sequence, its function, and so on.
  3. Researchers using GQ-Pat can obtain results much sooner than those using public databases like GenBank because patents are typically filed before publications are drafted.

We’re so pleased that more and more researchers are turning to GQ-Pat to search sequences for a huge variety of life science-related projects. When it comes to researching or protecting intellectual property, the quality of the results often depends on the size of the database they come from. That’s why we’re dedicated to maintaining the world’s largest. Of course, we’re also adding rich annotations and making sure updates are added on a weekly basis.

If you’ve never tried it, and you’re interested in searching our database of 334 million sequences (and growing) for yourself, there’s never been a better time to get in touch about a free trial.


* There are another 50 million protein sequences in UniProt, the leading protein sequence database, although 98% of UniProt is TrEMBL, which consists of 49 million unreviewed computer-generated protein translations of nucleotide sequences already in the nucleotide databases.


Get Started with a GenomeQuest Trial

Ready to search patents for sequences like professionals do? It’s easier than you think! Start your free trial of the GQ Life Sciences Suite now and you’ll even have access to experts from the GQ Life Sciences team to help you along.




