The America Invents Act (“AIA”) was (in my opinion) essentially a total capitulation by Congress to the fantasies of the megacorporations that are by far the biggest customers of the U.S. Patent Office (IBM alone accounts for about 2 percent of all patents issued each year). It is therefore ironic that Big Pharma is finding itself dodging a hail of bullets from the procedural assault weapons that the AIA legislated into existence. More »

Clients of mine know that one of my mantras is: ‘pendency is your friend’. There’s no such thing as “a” patent application – not if you care about staking out meaningful IP rights on technologically demanding subject matter. The minimum “unit” of practicable IP protection is not a patent, it’s an application chain that is kept pending. If you get a patent issued, that’s great, especially if it has strong claims. But whatever the claims, if a patent is worth paying maintenance fees on, it’s worth the (relatively) minor cost to keep the application chain pending.

In this post I’ll explain why. (I’ll leave the details of “how” for another time, except to say that you simply make sure to file a continuation or continuation-in-part application before the current application gets issued as a patent.) More »

As expounded elsewhere in this blog, I built a database of full-text U.S. patents and patent applications and some search tools for doing some data mining in it. As described in this post, I used it to extract a corpus of patents and applications that appear to represent in-house inventions of Intellectual Ventures, and organize them by subject matter.

Intellectual Ventures (“IV”) is the intellectual property powerhouse founded by Nathan Myhrvold after he left his position as Chief Technology Officer of Microsoft. IV receives a lot of criticism in the press and the blogosphere, often being portrayed as the mother of all “patent trolls” (a pejorative term originally popularized by IV co-founder and vice-chairman Peter Detkin, who was in-house at Intel at the time). IV owns a lot of patents: according to its own published figures, it has acquired 70,000 patents since its founding, and filed 3,000 patent applications on the inventions of its own team of inventors. (Not everyone thinks this is a good thing.) More »

Hourly rate billing is one of the truly unpleasant aspects of the lawyer-client relationship, for both the client and the lawyer. Personally, I’m trying to shift to mostly flat rate billing, but that isn’t a perfect solution either.

Clients hate hourly rate billing because it’s unpredictable. Lawyers hate it because clients always way underestimate how much time it takes to do things. That’s natural, because clients only see the end product. The client is thinking (and sometimes saying) “why am I getting billed six hours for a five-page filing? I could write five pages in an hour on a cell phone keypad with one thumb!” The client doesn’t see all the behind-the-scenes work that went into those five pages: the research, thinking, looking things up, going down false trails, reading 20 pages of fine print to find the relevant passage, etc. So clients always feel as though they’re being overcharged, and lawyers always feel under pressure to cut corners to keep the bill down.

I’ve found that one thing that helps a lot is not charging for client phone calls. Why? Several reasons: More »

Often when reading a patent the focus is not so much on what the patent says, it’s on being sure about what it doesn’t say. If an examiner has cited a patent as prior art, and you want to argue that it’s different from the claimed invention because the claimed invention includes some feature that the cited reference doesn’t disclose, you need to satisfy yourself that the feature is not in the cited reference anywhere.

That can be a very tedious undertaking. The particular reference that finally overcame my inertia and got me to scribble up the Python script that is the subject of this post is a patent application, cited as a prior art reference in an office action in one of my cases, that is 76 pages of two-column fine print. No way does the client want to pay me for the hours it would take to read it in detail. More »
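The excerpt does not show the script itself, so what follows is only a minimal sketch of the general idea, not the author's code: scan the text of a long reference for the terms tied to a claimed feature and list every hit with a little surrounding context, so that a term with zero hits stands out. The function name `find_term_hits`, the `context` parameter, and the sample text are all assumptions made for illustration.

```python
import re

def find_term_hits(text, terms, context=40):
    """Map each term to a list of context snippets where it occurs.

    Matching is case-insensitive and anchored at the start of a word,
    so "seal" also catches "seals" and "sealing".
    """
    hits = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term), re.IGNORECASE)
        snippets = []
        for match in pattern.finditer(text):
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            # Collapse line breaks and runs of spaces inside the snippet.
            snippets.append(" ".join(text[start:end].split()))
        hits[term] = snippets
    return hits

# Hypothetical stand-in for the full text of a cited reference.
reference_text = (
    "The housing is secured to the frame by a threaded fastener. "
    "A gasket seals the housing against moisture."
)

hits = find_term_hits(reference_text, ["fastener", "adhesive", "seal"])
for term, snippets in hits.items():
    print(f"{term}: {len(snippets)} hit(s)")
    for snippet in snippets:
        print("    ..." + snippet + "...")
```

A zero-hit result only shows that the literal term is absent. The reference may still describe the feature in other words, so synonyms need to go into the term list as well.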

In a probably futile effort to stave off Alzheimer’s by torturing my brain, I have been trying for a while now to learn to read Chinese. Chinese is crazy difficult, for a number of reasons that I won’t go into here. I don’t have the time to devote to it that it would take to become actually fluent, which as best I can tell would require dying and being reincarnated as a Chinese person plus about 20 years of full-time study. So my goal is more modest, just reasonable reading comprehension, mainly for reading patents and doing patent-related text mining. I try to spend half an hour or so a day reviewing vocabulary and reading things, to reassure myself that I’m still dumber than a Chinese ten-year-old.

Trying to learn to read Chinese might seem like a waste of time. Why not just use Google Translate? It turns out that for most anything of adult reading level and complexity, the output of Google Translate for Chinese to English is essentially incomprehensible gibberish. Machine translation between Chinese and English is a seriously non-trivial undertaking because the two languages are so different on so many dimensions. Personally, I’m skeptical that the current statistically based approach to machine translation can be made to work here; More »

It’s now possible to download from Google the full text of all issued U.S. patents back to 1976 (here) and all published U.S. patent applications back to 2001 (here). Getting anything useful out of them is not a task for the faint of heart: they use a fairly complicated XML schema, which has changed several times, and it’s a lot of downloading (about 70 GB, zipped, for 2007 to the present).

To put all that data into a form convenient for searching and extracting statistics, I wrote a Python utility that reads the XML, parses it into a standard set of fields, cleans up most of the Unicode weirdnesses, and outputs everything into a single large text file, one field per line, each line beginning with a four-letter identifier indicating what part of the document it is. I have posted the latest version on GitHub here, with detailed instructions/description. More »
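To make the "one field per line, four-letter identifier" layout concrete, here is a rough sketch of that kind of flattening. It is not the posted utility: the element names in the sample XML, the identifiers (`PATN`, `TITL`, `ABST`, `CLMS`), and the function `flatten_patent` are invented for illustration, and the real schema is considerably more complicated.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping of four-letter line identifiers to element paths.
FIELD_TAGS = [
    ("PATN", ".//doc-number"),
    ("TITL", ".//invention-title"),
    ("ABST", ".//abstract"),
    ("CLMS", ".//claim"),
]

def flatten_patent(xml_string):
    """Return one 'IDNT text' line per field found in a single patent record."""
    root = ET.fromstring(xml_string)
    lines = []
    for ident, path in FIELD_TAGS:
        for elem in root.findall(path):
            # Join text from nested tags, then collapse all whitespace
            # so each field fits on exactly one line.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                lines.append(f"{ident} {text}")
    return lines

# Made-up sample record, far simpler than the real data.
sample = """<patent>
  <doc-number>1234567</doc-number>
  <invention-title>Chair</invention-title>
  <abstract><p>A chair with a seat
  and a back.</p></abstract>
  <claims>
    <claim>A chair, comprising a seat.</claim>
    <claim>The chair of claim 1, further comprising a back.</claim>
  </claims>
</patent>"""

flat_lines = flatten_patent(sample)
print("\n".join(flat_lines))
```

With a fixed-width identifier at the start of every line, ordinary line-oriented tools are enough to pull out, say, every claim line without parsing XML again.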

In a patent application, the claims, the written description, and the drawings are supposed to fit together in a consistent way, according to a few basic rules. Disregard those rules, and — best case scenario — you’ll get to waste a lot of time and money filing corrected drawings and amendments to the specification. Worst case, you’ll get claims rejections that you can’t overcome without filing a continuation-in-part (expensive), and maybe not even then if some prior art has popped up in the meantime.

First, a bit of terminology and background. The claims are what determine the extent of your patent rights. You can think of claims as being made up of some elements, and some relationships between the elements. Referring to the canonical example invention — a chair — the claim might be something like:

A chair, comprising a seat, at least one leg extending downward from the seat, and a back extending upward from the rear portion of the seat.

In this example, the “elements” would be the seat, the leg, the back, and possibly the rear portion of the seat. Claims must also indicate how the elements relate to each other. Here, for example, “extending downward from the seat” indicates the relationship of the leg to the seat. More »

Lately I have been experimenting with some text mining ideas using the U.S. patent corpus which Google has conveniently provided for free download. Each raw data file contains all the patents issued in one week, in an XML format, typically on the order of 50 to 100 MB compressed (up to 500 MB when uncompressed).

One of the things I was curious about was the size of the vocabulary used in patent claims. It is commonly supposed that an average educated person has a vocabulary of about 20,000 words. A large English dictionary includes on the order of 250,000. How big a vocabulary is encompassed by the words commonly used in patents?

So I made a crude count of the words in the claims in slightly more than four years of issued U.S. patents (all patents issued from 2009 through January 2013) — a total of 14,717,173 claims from 878,461 patents. I counted the number of distinct words, and the number of times each word appeared, then sorted the list by frequency of appearance. More »
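A crude count of this kind might look like the sketch below. The tokenization rule (lowercase the text and keep only runs of the letters a to z) and the two sample claims are assumptions for illustration; the post does not say how words were actually split.

```python
import re
from collections import Counter

def count_claim_words(claims):
    """Count how often each word appears across a collection of claim texts."""
    counts = Counter()
    for claim in claims:
        # Crude tokenization: lowercase, keep only runs of a-z, drop digits.
        counts.update(re.findall(r"[a-z]+", claim.lower()))
    return counts

# Made-up sample claims standing in for the real corpus.
claims = [
    "A chair, comprising a seat and a back.",
    "The chair of claim 1, wherein the seat is padded.",
]

counts = count_claim_words(claims)
ranked = counts.most_common()  # (word, count) pairs, most frequent first

print("distinct words:", len(counts))
for word, n in ranked[:5]:
    print(word, n)
```

For the full corpus the same loop can be fed one claim at a time from disk, so only the counter itself has to fit in memory.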

Ex-Googler Michelle Lee has just been appointed to head the U.S. Patent Office’s new Silicon Valley satellite office. The corporate shills are predictably chortling with glee over her oft-aired views on the subject of “patent trolls”:

“By 2009, Lee was talking publicly and blogging about how so-called ‘patent trolls’ were a growing burden for Google, and the tech sector at large. That same year she authored a blog post saying that patent reform was needed “now more than ever.” Of twenty patent lawsuits that had been filed against Google, only two were from companies with any products or services.
. . . .
“Caroline Dennison, a legal advisor at the patent office, said her hiring was a sign of the office’s dedication to better dialogue with the tech sector. ‘[Lee] has been in the trenches with the non-practicing entities in litigation,’ said Dennison. ‘She gets it, she knows what’s going on. And we couldn’t be more thrilled to have her. Director [David] Kappos is committed to this industry, and committed to looking for solutions to this problem. . . .’”

Did you see what happened there? “Patent trolls” morphed into “non-practicing entities”? More »