Text Nailing is an information extraction method for semi-automatically extracting structured information from unstructured documents. The method allows a human to interactively review small blobs of text out of a large collection of documents and identify potentially informative expressions. The identified expressions can then be used to enhance computational methods that rely on text, as well as advanced natural language processing techniques. The method combines two steps: 1) human interaction with narrative text to identify highly prevalent non-negated expressions, and 2) conversion of all expressions and notes into non-negated, alphabetical-only representations to create homogeneous representations. In traditional machine learning approaches for text classification, a human expert is required to label phrases or entire notes, and then a supervised learning algorithm attempts to generalize the associations and apply them to new data. In contrast, using non-negated distinct expressions eliminates the need for an additional computational method to achieve generalizability.
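To make the two steps concrete, the following Python sketch illustrates the alphabetical-only normalization idea described above. The function name, the example expression "smokes 1 ppd", and the sample note are hypothetical illustrations, not taken from the published implementation.

```python
import re

def to_alpha_only(text: str) -> str:
    """Lowercase the text and keep only the letters a-z,
    producing a homogeneous, alphabetical-only representation."""
    return re.sub(r"[^a-z]", "", text.lower())

# Step 1 (hypothetical): a human reviewer has flagged this non-negated
# expression as indicating current smoking.
expression = to_alpha_only("smokes 1 ppd")      # -> "smokesppd"

# Step 2: normalize a narrative note the same way and look for the expression.
note = "Pt reports he smokes ~1 PPD for the past 10 years."
if expression in to_alpha_only(note):
    print("Note matches the smoking expression")
```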
Chen & Asch 2017 wrote "With machine learning situated at the peak of inflated expectations, we can soften a subsequent crash into a “trough of disillusionment” by fostering a stronger appreciation of the technology’s capabilities and limitations." A letter published in Communications of the ACM, "Beyond brute force", emphasized that a brute force approach may perform better than traditional machine learning algorithms when applied to text. The letter stated "... machine learning algorithms, when applied to text, rely on the assumption that any language includes an infinite number of possible expressions. In contrast, across a variety of medical conditions, we observed that clinicians tend to use the same expressions to describe patients' conditions." In his viewpoint published in June 2018 concerning slow adoption of data-driven findings in medicine, Uri Kartoun, co-creator of Text Nailing states that "...Text Nailing raised skepticism in reviewers of medical informatics journals who claimed that it relies on simple tricks to simplify the text, and leans heavily on human annotation. TN indeed may seem just like a trick of the light at first glance, but it is actually a fairly sophisticated method that finally caught the attention of more adventurous reviewers and editors who ultimately accepted it for publication."
Criticism
The human-in-the-loop process is a way to generate features using domain experts. Using domain experts to devise features is not a novel concept; however, the specific interfaces and method that help the domain experts create the features are most likely novel. In this case the features the experts create are equivalent to regular expressions: removing non-alphabetical characters and matching on "smokesppd" is equivalent to the regular expression /smokes[^a-z]*ppd/. Using regular expressions as features for text classification is not novel. Given these features, the classifier is a threshold set manually by the authors, chosen based on performance on a set of documents. This is still a classifier; it is simply one whose parameter, in this case a threshold, is set by hand. Given the same features and documents, almost any machine learning algorithm should be able to find the same threshold or a better one. The authors note that using support vector machines with hundreds of documents gives inferior performance, but do not specify which features or documents the SVM was trained and tested on. A fair comparison would use the same features and document sets as those used by the manually thresholded classifier.
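The equivalence claimed above can be checked with a short Python sketch. The example notes below are hypothetical, and the regular expression mirrors the one cited in the criticism.

```python
import re

notes = [
    "Patient smokes 1-2 ppd.",            # hypothetical positive note
    "Denies smoking; never smoked.",      # hypothetical negative note
]

for note in notes:
    lowered = note.lower()
    # Approach 1: strip non-alphabetical characters, then match a literal string.
    alpha_only_hit = "smokesppd" in re.sub(r"[^a-z]", "", lowered)
    # Approach 2: the regular expression the criticism treats as equivalent.
    regex_hit = re.search(r"smokes[^a-z]*ppd", lowered) is not None
    print(f"{note!r}: literal={alpha_only_hit}, regex={regex_hit}")
```

Both approaches flag the first note and not the second, which is the point of the criticism: the hand-crafted expressions behave like ordinary regular-expression features.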