Qualitative data analysis: data condensation (aka reduction)

Rather than trying to squeeze my thoughts and experience of all my data analysis into one blog post, I intend to write shorter (!) posts on different stages as I progress. My last blog post was about developing my analysis plan. Knowing what I know now, I am so thankful I spent the time doing that!

I am following Miles and Huberman’s approach to data analysis.

The purpose of this post is to share with you my first step of data analysis – data condensation. This used to be called data reduction (Miles and Huberman 1994) but it was changed because data reduction implies “weakening or losing something in the process”.


So immediately following my focus groups and interviews, I took extensive notes about salient factors (more about that in another blog post). From these notes I created a contact summary form, as advocated by Miles and Huberman and by one of my supervisors, which synthesised all this information. This is a very simple and highly valuable thing to do. I have repeatedly referred back to my contact summary forms throughout this process (if anyone wants the template, just ask). I also transcribed all my focus groups and interviews verbatim myself as soon as I could after data collection. This was a very, very long and, at times, laborious task, but again highly valuable for really getting to know my data.


I listened to each audio recording (listening only ~ no note taking). Then I read each transcript (reading only ~ no note taking). Then I listened to each audio recording again whilst reading my transcripts. This time I scribbled notes down on a pad and drew various mind maps and diagrams. After all that, I was pretty sure I had immersed myself in my data (even though I hated listening to myself!).

I prepared transcripts for importing into NVivo 10. This involved ensuring consistent format and style and anonymising my participants by allocating each of them a pseudonym and a code to differentiate public, healthcare and media professionals (a blog post about this here). This process took quite a bit of time, but if not done thoroughly, I can see how this could have caused me many problems later on.
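For anyone who would rather script the pseudonym step than do it by hand, here is a minimal Python sketch of the idea. It is only an illustration under assumptions, not the process used for this study: the group prefixes (PUB, HCP, MED), the function names and the simple find-and-replace are all hypothetical.

```python
import random

# Hypothetical prefixes: the only requirement is that a code shows whether a
# participant is public, a healthcare professional or a media professional.
GROUP_PREFIX = {"public": "PUB", "healthcare": "HCP", "media": "MED"}


def allocate_pseudonyms(participants, pseudonym_pool, seed=None):
    """Give each (real_name, group) pair a unique pseudonym and a group code.

    Returns a dict mapping real_name -> (pseudonym, code).
    """
    pool = list(pseudonym_pool)
    if len(pool) < len(participants):
        raise ValueError("not enough pseudonyms for the number of participants")
    random.Random(seed).shuffle(pool)  # shuffle so pool order reveals nothing

    counters = {group: 0 for group in GROUP_PREFIX}
    key = {}
    for (real_name, group), pseudonym in zip(participants, pool):
        counters[group] += 1
        code = f"{GROUP_PREFIX[group]}{counters[group]:02d}"
        key[real_name] = (pseudonym, code)
    return key


def anonymise(text, key):
    """Replace every real name in a transcript with 'pseudonym (code)'."""
    for real_name, (pseudonym, code) in key.items():
        text = text.replace(real_name, f"{pseudonym} ({code})")
    return text
```

The key that links real names to pseudonyms would need to be stored securely and separately from the transcripts. A plain find-and-replace will also miss nicknames, misspellings and indirect identifiers (job titles, place names), so a careful read-through is still needed.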

1st level coding: I developed a starting coding list based on my theoretical framework and wider literature to get me started (initial deductive approach). I listed these codes onto a coding framework with clear operational definitions so I had a clear understanding of what type of data needed to be assigned to each code. Throughout this stage, codes were revised or removed and additional codes and subcodes were created as new themes emerged from the data (inductive approach). As I revised my codes, each transcript was re-read and re-coded. I made sure at this stage I didn’t try to force my data into anything and that codes and sub-codes were all kept very descriptive.
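If it helps to picture a coding framework, here is a rough Python sketch of one as a data structure: each code carries an operational definition and a note of whether it came from the starting list (deductive) or from the data (inductive), and every change is logged. The class names, fields and log format are assumptions for illustration only; NVivo manages this internally in its own way.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    name: str
    definition: str  # operational definition: what data belongs under this code
    origin: str      # "deductive" (starting list) or "inductive" (from the data)
    subcodes: dict = field(default_factory=dict)


class Codebook:
    """A hypothetical coding framework that keeps a log of every revision."""

    def __init__(self):
        self.codes = {}
        self.history = []

    def add(self, name, definition, origin, parent=None):
        # Top-level codes live in self.codes; sub-codes live under their parent.
        target = self.codes if parent is None else self.codes[parent].subcodes
        if name in target:
            raise ValueError(f"code already exists: {name}")
        target[name] = Code(name, definition, origin)
        self.history.append(("add", name))

    def revise(self, name, new_definition):
        self.codes[name].definition = new_definition
        self.history.append(("revise", name))

    def remove(self, name):
        del self.codes[name]
        self.history.append(("remove", name))
```

Whenever a definition is revised or a code is removed, the transcripts coded so far need to be re-read and re-coded against the updated framework, which the log makes easier to keep track of.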

Pattern coding: This was about working with the 1st level codes and sub-codes so that they could be grouped into more meaningful and general patterns. This process was a little more challenging for me because at times I was aware that my thinking was going a little too fast and that I needed to remain fairly descriptive. I was also frightened about condensing too much and losing some of what I had. However, the beauty of NVivo is that you have an audit trail so if you do need to go back, everything is still there (I saved a copy of my NVivo project at the end of every day). While pattern coding, I examined my data carefully and asked a number of key questions such as: What is happening here? What is trying to be conveyed? What are the similarities? What are the differences? In doing so, I explored not only the similarities but also the differences and idiosyncrasies. This process took quite a number of iterations before I was happy to move on.

Memoing: to help me through the process of coding, I created LOTS of memos which captured a wide range of my thoughts and concepts. I was also able to link my memos to my data and any external resources such as websites or literature. Again, I cannot stress enough how valuable this has been (and still is). My research journal was also created as a memo.

Propositions: it took me a little while to get my head around what I needed to do here as I have always associated propositions with case study research. This was another lengthy process but has helped so much as I started to gently move from the descriptive stage to a more conceptual and interpretive stage. I went through all my coded data and developed propositions from them – so basically a summary or synthesis of my data. I initially developed 613 propositions then reduced this to 479 following removal of duplications. In order to have a better visualisation of these, I left my computer and turned to flip chart paper. I printed each proposition (within their pattern codes) on different coloured post-it notes and arranged and re-arranged them (lots of times!). This then led me to revise my pattern codes again. Of course, with that, I revisited all my data and yet again re-coded into my final revised pattern codes and sub-codes. Just to say at this point – this doesn’t mean these pattern codes are written in stone. They can be (and will likely be) altered again as my interpretation progresses.
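A first pass at removing duplicated propositions could be scripted along these lines. This is a minimal Python sketch under assumptions: the function names are hypothetical, and it only catches propositions that are identical once case, punctuation and spacing are ignored. Propositions that say the same thing in different words still need a human eye.

```python
import re


def normalise(proposition):
    """Lower-case, strip punctuation and collapse whitespace for comparison."""
    text = proposition.lower()
    text = re.sub(r"[^\w\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def remove_duplicates(propositions):
    """Keep the first occurrence of each proposition, preserving order."""
    seen = set()
    unique = []
    for proposition in propositions:
        key = normalise(proposition)
        if key and key not in seen:
            seen.add(key)
            unique.append(proposition)
    return unique
```

Keeping the original order means each surviving proposition still sits alongside the others from its pattern code, which matters if they are then printed and arranged by hand.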

So in a nutshell – that was my data condensation. Obviously we know that qualitative data analysis is not a linear process and requires many, many iterations. While this may be frustrating at times, it’s necessary and can be fun!

On a final note, if anyone is thinking about a CAQDAS programme, I cannot recommend NVivo 10 enough – I absolutely love it (I cannot comment on any other CAQDAS programme as I have only used NVivo). I know many people prefer manual analysis for a number of reasons, which is absolutely fine. NVivo has helped me hugely to store, manage and interrogate my data (of course it won’t interpret or write up my findings though!). The support you receive from QSR International through its various channels is also first class.

I am now in the throes of data display and developing lots of Framework Matrices. This is another really exciting stage and one that continually challenges my current thinking. That will be my next blog post. If you want to ask me any questions about my experience of data condensation, please ask away. Any comments would also be very welcome! I’m really trying to keep my blog posts short, but as you can see, I’m not doing well with that!


#data-analysis, #science