
HTAi15 IRG Advanced Searching Workshop: Understanding and Using Text Analysis Tools

The final session of the IRG Advanced Searching Workshop was divided into four parts and was rather intensive. Well, the entire workshop was intense! A lot was covered. Julie Glanville from York Health Economics Consortium spoke briefly about text analysis tools (covered in more detail at the HLA event earlier this year), and Carol Lefebvre talked about the project, begun in 2002, to test the Cochrane Highly Sensitive Search Strategy for RCTs (developed in 1993-4) to determine whether it still performed well with as few terms as possible. Since the development of the Medline filter, a few changes had occurred: the addition of the Controlled Clinical Trial publication type in MeSH (quasi-randomised trials), better trial reporting due to the CONSORT guidelines, and the development of text analysis tools. Testing with a gold standard set of RCTs and a set of non-trial records using WordStat showed that the best identifier term was the Controlled Clinical Trial publication type. But indexing changes (the cessation of double-indexing RCTs with the CCT publication type) prompted a reassessment, which found the best identifiers (the terms locating most RCTs) to be the RCT MeSH term OR the CCT publication type. This was one of the issues with the filter that Carol mentioned (always be aware of indexing policies and changes!); the other was the free-text terms needed to catch non-indexed records.

Siw Waffenschmidt and Elke Hausner from the Institute for Quality and Efficiency in Health Care (IQWiG) discussed generating test sets for search strategy development and data analysis, and guidance for identifying subject terms and text-words. The guidance comes from the recently published EUnetHTA Process of information retrieval for systematic reviews and health technology assessments on clinical effectiveness (still in its 2nd draft stage).
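The filter testing described above boils down to measuring a candidate term's sensitivity and precision against known sets of records. Here is a minimal Python sketch of that idea; the record IDs and the candidate term are entirely made up and have nothing to do with the actual WordStat study:

```python
def score_filter(retrieved, gold_standard, non_trials):
    """Return (sensitivity, precision, specificity) for a candidate filter term."""
    true_pos = retrieved & gold_standard   # relevant records the filter caught
    sensitivity = len(true_pos) / len(gold_standard)
    precision = len(true_pos) / len(retrieved) if retrieved else 0.0
    true_neg = non_trials - retrieved      # non-trials the filter correctly excluded
    specificity = len(true_neg) / len(non_trials)
    return sensitivity, precision, specificity

gold = {1, 2, 3, 4, 5}    # IDs of known RCTs (the gold standard set)
noise = {6, 7, 8, 9, 10}  # IDs of known non-trial records
hits = {1, 2, 3, 4, 7}    # records matched by some candidate term

print(score_filter(hits, gold, noise))  # -> (0.8, 0.8, 0.8)
```

The best identifier is then the term (or combination of terms) that keeps sensitivity high, since for a highly sensitive filter missing RCTs is worse than retrieving some noise.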
Hausner spoke about the work she and other information specialist researchers did comparing the conceptual and objective approaches to search strategy development, which is elaborated in this journal article: Development of search strategies for systematic reviews: validation showed the noninferiority of the objective approach. Basically, the research showed that a conceptual strategy developed by Cochrane with more synonyms was not superior to an objective search strategy on the same topic developed by IQWiG. However, the objective approach is not faster than the conceptual one. Time saved is not the issue here, though; it is the quality of the search. IQWiG demonstrated with their projects that the objective approach can produce high-quality strategies that are stable over time and suited to complex questions. Take-home points: text analysis tools are here to stay! It will take time to learn this approach, but the plus side is that it produces strategies of equal quality to those developed using the conceptual approach, as well as data to document strategy development and decision-making.

Text mining with Julie Glanville

The two advanced searching workshops went brilliantly, with lots of group discussions and questions from the audience. The only downside to the day was the too-cold air-conditioning. Can you concentrate with icy air being blasted on you? Luckily the day was warm, so lots of us (including me) went outside during breaks to warm up. Lots of ground was covered over the day, giving everyone plenty to think about.

The first workshop was about using text mining tools to build search strategies. I had heard of PubReMiner, a tool that mines PubMed records using words from your query to bring up MeSH terms, top authors and journals, word frequency and other data. I used this tool last week to get a list of words for a search about spiritual care I am working on. Another tool similar to PubReMiner is GoPubMed (you could use both to build strategies as each has different visuals). MeSH on Demand, a tool provided by the National Library of Medicine, takes a piece of text, say a systematic review protocol, and mines PubMed to bring up relevant MeSH terms and PMIDs. If you have a strategy ready, you could use this tool to find out whether relevant PMIDs have been captured in your results. Both these tools identify single words and MeSH terms, but what if you want to identify phrases? There is a tool for that, and it is called TerMine. The example Julie used was a full Medline record, but you can use larger pieces of text. Then there is EndNote, which you can use to analyse frequency data from databases other than Medline. There is a bit of work involved in setting it up to do this, though, and I guess the records you analyse would have to be relevant ones in order to build up a list of terms. If you could do this with the full text in EndNote, that would be great! I will have to do some experimenting. The last two freely available tools demonstrated were Quertle and Knalij. Knalij is a fun tool that visualises relationships between concepts. And Quertle, which I had forgotten about, now has relationship terms (called Power Terms) that you can use to connect concepts and bring up records mined from the web. And if you want to take a break from looking at heaps of text, you can take an alternate route by using a tool called VOSviewer.
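To get a feel for what the word-frequency side of tools like PubReMiner does under the hood, here is a toy Python sketch. The record titles and the stopword list are invented for illustration; the real tools work over actual PubMed records and much larger vocabularies:

```python
import re
from collections import Counter

# A tiny, made-up stopword list; real tools use much larger ones.
STOPWORDS = {"the", "of", "a", "in", "and", "for", "to", "on"}

def term_frequencies(records, top_n=5):
    """Count the most frequent non-stopword terms across a set of record texts."""
    words = []
    for text in records:
        words += [w for w in re.findall(r"[a-z]+", text.lower())
                  if w not in STOPWORDS]
    return Counter(words).most_common(top_n)

records = [
    "Spiritual care in palliative nursing",
    "Nurses' perceptions of spiritual care",
    "Palliative care and spirituality: a review",
]
print(term_frequencies(records, 3))
# -> [('care', 3), ('spiritual', 2), ('palliative', 2)]
```

The high-frequency terms become candidate text-words (and pointers to candidate MeSH terms) for the strategy you are building.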

So now that you have a selection of terms, how do you combine them into a search strategy? The most popular concept model is PICO, but not all questions fit that model. What if you have questions about public health or adverse effects? What do you do when you have questions that don't lend themselves to A AND B AND C strategies? This is where searching with multiple combinations comes into play. I did one of these recently, and one combination I used didn't come up with a relevant article I found using a different combination, and vice versa.
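One way to make those multiple combinations systematic is to pair up concept blocks programmatically rather than AND-ing them all together. A small sketch in Python, with hypothetical concept blocks and terms (your real ones would come from the term harvesting above):

```python
from itertools import combinations

# Hypothetical concept blocks for an example question.
concepts = {
    "population": "(elderly OR aged OR older adult*)",
    "exposure":   "(falls OR fall prevention)",
    "setting":    "(nursing home* OR long-term care)",
}

def pairwise_strategies(concepts):
    """Build one search line per pair of concept blocks,
    instead of a single A AND B AND C line."""
    return [f"{concepts[a]} AND {concepts[b]}"
            for a, b in combinations(sorted(concepts), 2)]

for line in pairwise_strategies(concepts):
    print(line)
```

Running each pairwise line and de-duplicating the results catches records that any single all-concepts line would miss, which matches my experience above of different combinations finding different articles.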

The last section of this half-day workshop was about ensuring strategy quality. A common method is to have some gold standard articles and check whether your strategy captures them. You could also use Related Items in PubMed to find relevant articles to test with. Other methods include testing with specialised sets, e.g. CENTRAL, or a service called Epistemonikos (Database of Best Evidence). Another way to test your strategy is to work out the ratio of relevant to irrelevant records. And finally, there is the capture-recapture method. This method, which I hadn't heard of before, estimates the total number of relevant records from the overlap between two independent captures of records.
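The capture-recapture idea can be made concrete with the simple Lincoln-Petersen estimator; the numbers below are invented for illustration:

```python
def lincoln_petersen(n1, n2, overlap):
    """Estimate the total number of relevant records from two independent
    searches finding n1 and n2 relevant records, with `overlap` found by both."""
    if overlap == 0:
        raise ValueError("no overlap between the two captures; cannot estimate")
    return (n1 * n2) / overlap

# Search A finds 40 relevant records, search B finds 30, and 20 appear in both:
print(lincoln_petersen(40, 30, 20))  # -> 60.0 estimated relevant records in total
```

If your searches together have found, say, 50 of an estimated 60 relevant records, that gives you a rough sense of how much you might still be missing.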

Lunch! Then the following post will be about workshop 2.

Assessing qualitative research for systematic reviews

Do you search for studies for potential inclusion in qualitative systematic reviews? If you do, you might be interested in this quality assessment tool developed by Dr Christopher Carroll and Dr Andrew Booth. QuaRT aims to assist reviewers with decisions about inclusion and exclusion of qualitative studies by asking questions in four domains: question and study design, selection of participants, method of data collection and method of data analysis. This is still a new tool, and as it was developed during the writing of a health technology assessment, it has not yet gone through methodological development, testing and evaluation. If you plan to use this tool, Drs Carroll and Booth would be interested to hear from you.