A text parser basically takes a continuous flow of text-based input
and breaks it down or extracts it into various pieces. The key to
that extraction is having recognizable (and consistent) delimiters or
patterns.
I think the biggest problem you'd have would be whether or not various
obituaries have a consistent enough pattern to extract the data,
especially across various publications. You could certainly look for
key phrases like "passed away" or "born" or "survived by" to parse the
data into pieces, then parse those apart with other delimiters. For
example, once you found "survived by", a semicolon could be used to
delimit between each type of survivor, as in:
That last line could then be parsed by the comma delimiter, giving you:
Clean it up to remove the "and" and convert the first word of
"grandchildren" to "Grandchild" and you'd end up with:
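
As a rough sketch of those steps in Python, using a made-up sentence
(the names, the wording, the SINGULAR table and the helper
parse_survivors are all hypothetical, and real obituaries will be far
less tidy):

```python
import re

# Hypothetical sample text; real obituaries vary widely in wording.
OBITUARY = (
    "John Smith passed away on March 3, 2001. He is survived by "
    "his wife, Mary; his sons, Robert, James and William; "
    "and his grandchildren, Anna and Thomas."
)

# Tiny, incomplete lookup used to turn plural labels into singular ones.
SINGULAR = {"sons": "son", "daughters": "daughter", "grandchildren": "grandchild"}


def parse_survivors(text):
    """Return {relationship: [names]} from the sentence after 'survived by'."""
    # Assumption: the survivor list ends at the first period, so a name
    # containing "Jr." would cut it short.
    match = re.search(r"survived by\s+(.*?)\.", text, re.IGNORECASE | re.DOTALL)
    if match is None:
        return {}

    survivors = {}
    # A semicolon separates each type of survivor.
    for group in match.group(1).split(";"):
        # Strip a leading "and" and a possessive such as "his" or "her".
        group = re.sub(
            r"^(?:and\s+)?(?:(?:his|her|their)\s+)?",
            "",
            group.strip(),
            flags=re.IGNORECASE,
        )
        # The first comma separates the relationship from the names.
        relationship, _, rest = group.partition(",")
        # Remaining commas (and the word "and") separate individual names.
        names = [n.strip() for n in re.split(r",|\s+and\s+", rest) if n.strip()]
        label = relationship.strip().lower()
        label = SINGULAR.get(label, label).capitalize()
        if label and names:
            survivors[label] = names
    return survivors


if __name__ == "__main__":
    for relationship, names in parse_survivors(OBITUARY).items():
        print(relationship, names)
```

The period-terminated match and the fixed possessive list are
simplifications; a different publication's phrasing would likely need
its own patterns.
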
The number of variations on names may make it difficult to determine
exactly what the last name is. In this case, the pattern is
straightforward, but if you start adding "Jr." or "III" or two-word
last names like "Le Clair", it gets more difficult.
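
To illustrate why this gets fiddly, here is a minimal heuristic sketch
in Python. The SUFFIXES and PARTICLES sets and the split_name helper
are assumptions made up for this example, and the lists are nowhere
near complete:

```python
# Hypothetical, incomplete lists; real data will need many more entries.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}
PARTICLES = {"le", "la", "de", "van", "von", "st"}


def split_name(full_name):
    """Return (given_names, last_name, suffix) using simple heuristics."""
    parts = full_name.replace(",", " ").split()

    # Peel off a trailing suffix such as "Jr." or "III".
    suffix = ""
    if parts and parts[-1].rstrip(".").lower() in SUFFIXES:
        suffix = parts.pop()
    if not parts:
        return "", "", suffix

    # Start with the final word as the last name, then pull in any
    # preceding particle ("Le", "Van", ...) while still leaving at
    # least one word for the given name.
    start = len(parts) - 1
    while start > 1 and parts[start - 1].rstrip(".").lower() in PARTICLES:
        start -= 1

    return " ".join(parts[:start]), " ".join(parts[start:]), suffix


if __name__ == "__main__":
    for name in ["John Smith", "John Smith Jr.", "Marie Le Clair"]:
        print(split_name(name))
```

A rule like this still guesses wrong on unhyphenated double surnames
or on given names that happen to look like particles, which is the
kind of ambiguity that makes name parsing difficult.
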
However, I suspect most of the algorithms you need already exist on
the net, since a routine like this has probably been needed for one
thing or another...