Updated: Oct 15
Coding is a core function in ATLAS.ti that lets you ‘tell’ the software where the interesting things are in your data. Coding in a technical sense simply means assigning a label to a data segment. A better-known term these days is tagging. The goal of tagging is to find the things you tagged using the tag name. The software uses the words ‘code’ and ‘coding’, as almost all the other CAQDAS do. My guess is that this is because of the popularity of grounded theory at the time when the first programs were developed in the late 1980s and early 1990s. Coding in CAQDAS, however, is very different from grounded theory coding in the methodological sense (see Friese, 2016 and 2019).
Coding means that we attach labels to segments of data that depict what each segment is about. Through coding, we raise analytic questions about our data […]. Coding distils data, sorts them, and gives us an analytic handle for making comparisons with other segments of data.
(Charmaz, 2014, p. 4)
Coding is the strategy that moves data from diffuse and messy text to organized ideas about what is going on.
(Richards and Morse, 2013, p. 167)
If you are more comfortable with the idea of tagging, in what follows simply replace the terms ‘code’ and ‘coding’ in your mind with ‘tag’ and ‘tagging’. A code in ATLAS.ti can be a simple description, a concept, a category, a subcategory, or a wildcard that modifies a link in a network. The software itself does not dictate how to use a code. It only provides this entity as an item in the toolbox. In this article, I give you some guidance on how to use the ATLAS.ti toolbox to build an effective coding system that helps you advance your analysis.
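To make the technical meaning of coding (assigning a label to a data segment so that you can find it again by the label) a bit more tangible, here is a minimal Python sketch. It is only an illustration of the idea and has nothing to do with how ATLAS.ti works internally. The quotation ids, the quotation texts, and the function name `build_code_index` are all made up for this example.

```python
from collections import defaultdict

# Hypothetical quotations: (id, text, codes attached to the segment).
quotations = [
    ("Q1", "I appreciate my own parents in a new way.",
     ["positive effects of parenting"]),
    ("Q2", "We wanted to pass something on.",
     ["reasons for having children"]),
    ("Q3", "I love my partner more, as a father and as a husband.",
     ["positive effects of parenting"]),
]


def build_code_index(quotations):
    """Map each code name to the ids of the quotations it labels."""
    index = defaultdict(list)
    for quotation_id, _text, codes in quotations:
        for code in codes:
            index[code].append(quotation_id)
    return dict(index)


code_index = build_code_index(quotations)

# Retrieval by code name: everything tagged as a positive effect.
print(code_index["positive effects of parenting"])
```

Looking up a code name returns every segment that carries that label, which is all that 'coding' means in the technical sense.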
ATLAS.ti AI Coding
A short note on the ATLAS.ti AI coding function. While the ATLAS.ti AI coding function can be useful in some cases (see this article by Christina Silver), it's essential to acknowledge that it does not replace the way a qualitative researcher codes data.
The AI coding tool analyzes your data paragraph by paragraph and assigns multiple tags to each segment. These initial tags may appear reasonable at first glance. However, they reflect the difference between machine interpretation and human insight.
When you begin coding yourself, you at first might also attach descriptive labels to your text. Yet, as you read more data, a shift occurs. You realize that refining these labels into more abstract forms fosters the ability to aggregate similar data segments under a unified code. Check out the article: "From Raw Data to Insightful Results: The process of computer-aided qualitative data analysis" where I explain this process in more detail.
The ATLAS.ti AI tool is not capable of this process, at least not yet. This results in an accumulation of hundreds and even thousands of descriptive labels. I recently received an SOS message from a Ph.D. student wondering what he should do with the over 7000 codes the AI coding had generated for 31 interviews.
When you give the tool a try, you'll notice that at times it generates codes on a more aggregated level that code more than just a single instance. For example, it might create a code called "emotions" and tag 170 different data segments. But here's the catch: that label by itself doesn't tell you much about what's going on. Emotions are evidently involved, but the label does little to advance your analysis.
Now, if you were to take a closer look at those 170 tags, you'd probably find they actually fit into around ten different categories. These categories could encompass facets such as emotional types, contextual occurrences, triggering factors, coping strategies, and others. Some of the codings probably need to be sorted into categories that do not have emotions in their name. If you want to learn more about ATLAS.ti AI coding and understand why it can't replace your own coding efforts, I invite you to watch the following video: Can AI Coding in ATLAS.ti Live Up To The Promises?
After having clarified this, let's move on to how you can approach building a coding frame in ATLAS.ti. If you are very new to coding in qualitative research, you may want to read this blog article first: Cracking the Code: How Qualitative Data Analysis is Like Solving a Puzzle. Rest assured, the process of coding data is far from daunting; it's akin to piecing together a puzzle. You have learned how to categorize things from a very early age. Initially, you classified creatures with four legs emitting "wuff" as dogs. Over time, your knowledge expanded as you differentiated between various types: the compact Poodles or Dachshunds, and the stately German Shepherds or Great Danes, representing the larger breeds.
Remember this simple analogy as you continue reading. The innate ability to categorize is already within you, serving as a foundation for your coding journey. Now, it's a matter of channeling this skill toward qualitative data analysis. You're already a capable coder; all that's left is to apply this aptitude to the realm of qualitative data.
First Steps in Developing a Code System
Unless you want to code deductively using an existing framework, keep an open mind when you begin to code your data. Notice as many things as you can and collect them via coding. If you feel that it is important to read all the data first and to write down notes on a piece of paper before you create codes in ATLAS.ti, then this is a suitable way to proceed. If, however, after reading some of your data, you already have some ideas for codes, then go straight ahead and start coding in ATLAS.ti. Do whatever feels most natural to you.
Take a look at the following video to learn the technical part of coding in ATLAS.ti.
At the outset, you will generate lots of new codes. You might quickly have a list of 50 to 100 codes. As time progresses, you will reuse more and more of the codes you already have, and you won't need to create new ones. You'll eventually reach a saturation point. Technically speaking, this means you will be coding with existing codes, either by selecting them in the coding dialogue window or by dragging and dropping them from the navigation panel onto the relevant data segments.
At this stage, you have roughly described the various elements in the data. As soon as you reach this point – i.e. you no longer add new codes (or only a few) – it is time to review your coding system. If you do it at a much later stage it will need more work, because then you will have to go through all the documents again to apply newly developed subcategories and recheck all other codings. For this initial phase in the coding process, I suggest focusing on the documents within your data that exhibit the greatest divergence. By doing so, you increase the probability of encountering a wide spectrum of topics and themes.
Let’s assume you have taken your first round of coding up to this point. Those coders who naturally develop a mix of descriptive and abstract codes will have around 100 codes, depending on the project. Smaller student projects may hold around 50–70 codes.
This is also an appropriate time to export your project. This preserves the original coding and allows you to compare it later with more advanced versions of your project. In this way, you can describe in your method section how you got from A to B and C in your project.
The initial process of tidying up and restructuring your primary code list takes place within the software itself. If you undertake this process manually, such as on paper, the subsequent step involves implementing these changes within ATLAS.ti. I'm mentioning this because I've observed researchers taking this approach: they export their code lists to Excel and proceed to sort and arrange them there. While this might seem like an instinctive step, especially if you're more accustomed to Excel at this point, it doesn't contribute to advancing your analysis within ATLAS.ti. As you read on, I'll guide you through the steps to achieve this directly within the ATLAS.ti platform.
If you have noticed a lot of things, say you already have 300 or more codes after coding a few interviews, your codes are probably very descriptive. Coders of this type are referred to in the literature as splitters (Guest et al., 2012; Bernard and Ryan, 2010). If you are a splitter, you need to stop coding new data at this point, review your coding and begin to merge your codes.
As a splitter, you may find it difficult to let go of your codes through merging for fear of losing something. I can assure you that this is not going to happen. After merging and reorganizing your codes, you will have a single code that might hold ten quotations in their original form. This is far better than ten codes that only summarize one data segment each (= one piece of the puzzle). It is no problem for the computer to manage 1,000 or more codes. However, instead of being conducive to your analysis, such a high number of codes will prevent further analysis.
The need to push codes from a descriptive to a conceptual level has also been described by Corbin and Strauss:
One of the mistakes beginning analysts make is to fail to differentiate between levels of concepts. They don’t start early in the analytic process differentiating lower-level explanatory concepts from the larger ideas or higher-level concepts that seem to unite them. … If an analyst does not begin to differentiate at this early stage of analysis, he or she is likely to end up with pages and pages of concepts and no idea how they fit together. (2008, p. 165)
As a side note, ATLAS.ti's AI coding makes exactly this mistake. So if it were a qualitative analyst, it would still have a lot to learn.
Building an Effective Coding System: Creating Categories and Subcodes
The first categories that you develop are likely to be provisional, as they are based on very little coding. With more coding, they are likely to change and develop further. I like Saldaña’s idea of first-cycle and second-cycle coding (Saldaña, 2013: 8). The idea of the cycle fits the nature of the N-C-T model, where you have seen that qualitative analysis is cyclical rather than linear.
First-cycle coding, according to Saldaña (2009: 45), refers to those processes that happen during the initial coding. These are the ideas you notice and collect when you begin the coding process. Second-cycle coding is the next step. From experience, I would like to add that there is at least a third and fourth cycle of coding as well. When coding data in ATLAS.ti, the aim of this process is to develop a structured code list based on a subsample of your data. Once you have developed a first structure, you can apply the codes to the remaining data. You will likely continue to make changes to the code list and refine the structure the more you code. But this is OK.
Other authors also describe the coding process in a similar way (see, e.g., Bazeley, 2013; Bazeley and Richards, 2000; Charmaz, 2006; Fielding and Lee, 1998; Kuckartz, 1995; Richards, 2009; Richards and Morse, 2013; Silver and Lewins, 2014). Richards (2009), for example, refers to it as a catalog of codes. As advantages of a well-sorted catalog, she mentions speed, reliability, and efficiency.
The problem, as I see repeatedly in my everyday work, is the translation of this process into mouse clicks and the technicalities of it in a software environment. Even if users know the technical aspects of coding, on the one hand, and read the useful tips, on the other, they often find it difficult to apply these skills. It is not so difficult, but neither is it self-explanatory.
How to Develop Subcodes
The goal in developing subcodes is to achieve a good description of heterogeneity and variance in the data material. In principle, two approaches are possible: subcodes can be developed based on previous knowledge (i.e. known aspects from the theoretical literature) or generated empirically based on the data material. In the following, I will explain the empirically based approach. For those who are familiar with the writing of Bazeley and Richards (2007), this is what they refer to as ‘code-on to finer categories’.
The codes in this project are equivalent to the castle pieces of the jigsaw puzzle. They each hold a loose collection of data segments relating to the various factors like how to define happiness, the positive and negative effects of parenting, reasons for having or not having children, and so on. No subordinated aspects have been coded for yet.
When you start coding, it is sometimes easier to first collect data segments with similar content under one main topic, rather than thinking about potential subcodes immediately. The next step is to go through each code, one at a time, double-click it and read the coded data segments. While you do that, write down ideas for subcodes on a piece of paper or in a memo. Don't go overboard and write down 20 subcodes for 25 quotations. Let's have a look at the positive effects of parenting:
You find things like: loving my partner more, appreciating my own parents in a new way, bringing me in closer contact with other people, understanding many things about my mother's decision, and loving my partner in two beautiful ways, as a father, and as a husband. Instead of creating two or three subcodes from this, find a term that summarizes the contents of all data segments, e.g. Improved relationships.
You may realize now why you should not wait until you have collected over 100 data segments under one code label, as is the case for "reasons for having children" in the above example. It means you have to read all data segments again, 107 in this case, and create ideas for subcodes before continuing with the next step.
Once you have decided which subcodes you want to create, you can open the Split Code tool: Right-click on the code in the Code Manager and select: Split into Subcodes from the contextual menu.
Next, click on the Add Codes button and enter the labels for your subcodes. Then click Create.
Go through the list of quotations and sort them into the fitting subcode(s) by ticking the box.
It is also possible to select two or more subcodes if you think that the quotation covers more than one aspect. It is quite common that multiple aspects are mentioned in one sentence or meaning unit. In order not to lose the context and for further analytical purposes, it can be better to apply multiple codes rather than to create several non-overlapping quotations for each concept.
From a methodological point of view, the codes need to be mutually exclusive. This means each code needs to have a distinct, unique meaning. The data segments, however, do not have to be assigned in a mutually exclusive manner for qualitative data analysis. This is different in quantitative content analysis, or when your intention is to calculate inter-coder agreement. Then other rules apply.
If you come across a quotation that does not fit anywhere, don't assign it to a subcode. ATLAS.ti automatically creates a subcode "undecided". All quotations that have not been assigned to a subcode will be sorted into the undecided code.
After splitting the code, you can review the undecided subcode and sort its quotations into other codes or remove them. You are still in the initial stages of coding and your thoughts on the data are constantly evolving. When you first read the data segment, you decided to associate it with this code. If that no longer fits, it is okay to either remove the quotation or link it to another code.
Once you are done with assigning quotations to subcodes, click on the blue Split Code button.
Look at the Code Manager to see what has happened. ATLAS.ti has created the subcodes and sorted all quotations into them.
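If it helps to think of the splitting step in more abstract terms, the logic can be sketched in a few lines of Python. This is only a conceptual model of what was described above, not ATLAS.ti code: the function `split_code`, the quotation ids, and the data structures are assumptions made for illustration. The sketch shows two points from the text: a quotation can be sorted into more than one subcode, and anything you do not assign ends up in "undecided".

```python
def split_code(quotation_ids, subcodes, assignments):
    """Sort quotations into subcodes; unassigned ones go to 'undecided'."""
    result = {name: set() for name in subcodes}
    result["undecided"] = set()
    for quotation_id in quotation_ids:
        chosen = assignments.get(quotation_id, [])
        if not chosen:
            result["undecided"].add(quotation_id)
        for name in chosen:
            # Raises KeyError if a subcode is used that was never created.
            result[name].add(quotation_id)
    return result


# Hypothetical example: three quotations, two subcodes.
subcodes = ["improved relationships", "personal growth"]
assignments = {
    "Q1": ["improved relationships"],
    "Q2": ["improved relationships", "personal growth"],  # two aspects
    # Q3 is not ticked anywhere.
}
result = split_code(["Q1", "Q2", "Q3"], subcodes, assignments)

for name, ids in result.items():
    print(name, sorted(ids))
```

Note that no quotation disappears in the process. Each one ends up in at least one subcode or in "undecided", where you can review it later.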
I recommend selecting all codes of the category and assigning a color, and creating a code group for this new category for later filtering purposes.
Consider the subcodes developed at this time as provisional; they can change. If you continue coding, it will become easier to find better matching code names. It could also be that you find that a subcode is not suitable at all and that you need to integrate it elsewhere. You have only coded very few documents up to this point and this is not the last version of your code list. But you can already work with it, and the more structured your code list is, the easier it will become to use it.
The following video shows you all steps as described above using the same sample data. I also show you a second way to split codes, using the Quotation Manager.
Building Categories from Descriptive Labels
The example project I am using for this section holds 47 codes related to the positive and negative effects of being a parent. The aim is to learn how to move the analysis from a descriptive to a conceptual level.
Many of the code labels are based on the words the respondents used and thus are very close to the data. However, the quotations are longer and include more contextual information.
Let me remind you once more of the puzzle analogy: most of the codes in this list represent only individual pieces of the puzzle. The frequency of each code is very low. If you leave this list unchanged without summarizing similar codes, you will inevitably end up in the code swamp.
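A quick way to see whether you are heading for the code swamp is to look at how many codes hold only very few quotations. The following Python sketch illustrates the check. The code names are taken from the example project, but the frequencies are invented, and the threshold of two quotations is an assumption that you can adjust.

```python
# Hypothetical code list: code name -> number of quotations it holds.
code_frequencies = {
    "a bit wiser": 1,
    "appreciating my own parents more": 2,
    "becoming a better person": 1,
    "personal growth": 4,
}


def merge_candidates(frequencies, max_quotations=2):
    """Return the codes holding at most `max_quotations` quotations."""
    return sorted(
        code for code, count in frequencies.items()
        if count <= max_quotations
    )


candidates = merge_candidates(code_frequencies)
print(candidates)
```

Codes that show up in such a list are not wrong, but they are the ones to look at first when you start to aggregate.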
At this stage, it often happens that analysts use code groups to collect their descriptive codes without further conceptualization. That is not a clever idea. It only results in the list of codes being still unsorted and becoming very long. Further, it complicates later analysis, and it will be difficult to explain your code system to a third person.
Below, you will see that I use code groups as filters, so I can better focus on specific aspects. However, code groups do not replace the development of categories in the code list itself.
Applying the NCT process
The codes in this project stand for things that I noticed in the data. So far, I have not collected much. To master the conglomerate of terms, I can now use code groups to group similar codes. This allows me to reduce the code list and focus on a subset of codes using the filter function.
If you are a splitter, it could be that you have created a few hundred codes at this stage. Effectively condensing this extensive list of codes can greatly assist you in maintaining a clear overview and preventing information from becoming overwhelming.
The next step is to merge these codes. It does not make sense to keep all codes that describe only one or two segments of data. Analysts are often afraid of merging as they fear that they will lose something. Rest assured, you do not lose anything. Any quotations you have noticed will not disappear; they are just linked to a code at a higher aggregated level. What you lose are the descriptive labels that clutter your code list.
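For those who like to see the logic spelled out, here is a small Python sketch of what merging means conceptually. It is not ATLAS.ti code. The function `merge_codes`, the quotation ids, and the code 'less time for myself' are hypothetical and serve only to illustrate that merging relinks quotations without dropping any of them.

```python
def merge_codes(codings, codes_to_merge, target):
    """Return a new code -> quotation ids mapping with codes folded into target."""
    merged = {
        code: set(ids)
        for code, ids in codings.items()
        if code not in codes_to_merge
    }
    combined = set(merged.get(target, set()))
    for code in codes_to_merge:
        combined |= codings[code]
    merged[target] = combined
    return merged


# Hypothetical codings: code name -> ids of the quotations linked to it.
codings = {
    "a bit wiser": {"Q1"},
    "becoming a better person": {"Q2", "Q3"},
    "personal growth": {"Q4"},
    "less time for myself": {"Q5"},
}
merged = merge_codes(
    codings, ["a bit wiser", "becoming a better person"], "personal growth"
)

# The set of quotations is the same before and after merging.
all_before = set().union(*codings.values())
all_after = set().union(*merged.values())
print(sorted(merged["personal growth"]), all_before == all_after)
```

The code list shrinks from four entries to two, while all five quotations are still there. Only the two descriptive labels are gone.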
In the following, I show how you can aggregate descriptive codes to create categories. If you look at the list of codes, you will quickly find that it is thematically about the positive and negative effects of parenting. Therefore, I propose to first look for codes that name a positive effect and to collect them in a code group.
To do this, select the first few codes in the list that you find are about positive effects by holding down the Ctrl key (e.g., the codes ‘a bit wiser’, ‘appreciating my own parents more’, ‘becoming a better person’, etc.). Drag these codes into the side panel to the left and enter the name: 'Positive effects'. Gradually add all the other codes that you think describe a positive effect. If you are not sure, look at the corresponding quotation(s).
Once you are done, click on the code group. The list of codes is now reduced to the codes that relate to the positive effects of parenting, in this case 21.
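Conceptually, a code group used as a filter is nothing more than a named set of codes that temporarily hides everything else. The following Python sketch illustrates this under simplifying assumptions. The function `filter_codes` and the code 'less sleep' are invented for the example, and the sketch does not reflect how ATLAS.ti implements code groups.

```python
# Hypothetical (shortened) code list and one code group.
codes = [
    "a bit wiser",
    "becoming a better person",
    "less sleep",
    "personal growth",
]
code_groups = {
    "Positive effects": {
        "a bit wiser",
        "becoming a better person",
        "personal growth",
    },
}


def filter_codes(codes, code_groups, group_name):
    """Return only the codes that are members of the given code group."""
    members = code_groups[group_name]
    return [code for code in codes if code in members]


visible = filter_codes(codes, code_groups, "Positive effects")
print(visible)
```

The filter changes what you see, not what you have. The full code list stays untouched, which is also why a code group cannot take the place of a category in the code list itself.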
Given that this project currently comprises just 47 codes, you might question the necessity of this approach. However, a typical project at this juncture likely encompasses well over 100 codes, and if you are a splitter, you could be managing even a few hundred codes. The challenge arises when the number of codes exceeds your screen's capacity. This makes it difficult to effectively sort and organize them for further refinement. This is where the code groups come in handy.
Look at the reduced list of codes. The task is to find codes that belong to a common concept. The concept could already be represented by one of the codes in the list, such as personal growth. If this is not the case, you need to find abstract labels that you can use to combine similar codes. For the example, I propose to further reduce the list to the following four concepts:
A richer, more meaningful life
To do this, you select all codes that are related to the same concept by holding down the Ctrl key. Right-click and select Merge Codes from the context menu.
In the Merge dialogue, select the code that you want to keep and click the button Merge Codes.
It is also possible to drag and drop codes onto each other to merge them.
Look at the screenshot below. I have merged everything fitting the concept "personal growth". The code list has become much shorter, and the code ‘personal growth’ now holds 8 quotations.
The next concept is 'improved relationships'. Since there is no code label with that name, I merge all the codes that fit this concept into any one of them. After merging, I rename this code with the name of the concept. After repeating the process for all four concepts, the result looks as follows:
The next step is to subsume them under a category, in this case, "Positive Effects". Should you choose to remove the code group filter at this point, any codes you've associated with the overarching concept of "positive effects" will no longer appear consecutively in the code list due to its alphabetical arrangement.
Highlight all codes in the current filter, right-click and select Create Category Code from Selection.
Add the category code to the code group, highlight all codes, and give them a color. Voilà, there is your new category with subcodes:
Following this, your subsequent action could be to consolidate all the descriptive labels pertaining to the adverse effects of parenting. To further aggregate the codes and create a deeper hierarchy, you can use folders. This is illustrated in the following screenshot:
To follow the above-described process live, check out the following video:
If you want to learn more about ATLAS.ti, visit Qualitative Research Training.
If you are looking for support: join the Qualitative Research Community.
Bazeley, Pat (2013). Qualitative Data Analysis: Practical Strategies. London: Sage.
Bazeley, Pat and Richards, Lyn (2000). The NVivo Qualitative Project Book. London: Sage.
Bernard, H. Russell and Ryan, Gery W. (2010). Analyzing Qualitative Data: Systematic Approaches. London: Sage.
Charmaz, Kathy (2006/2014). Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis. London: Sage.
Corbin, Juliet and Strauss, Anselm (2008/2015). Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory (3rd and 4th ed.). Thousand Oaks, CA: Sage.
Fielding, Nigel G. and Lee, Raymond M. (1998). Computer Analysis and Qualitative Research. London: Sage.
Friese, Susanne (2019). Grounded Theory Analysis and CAQDAS: A happy pairing or remodeling GT to QDA? In Tony Bryant and Kathy Charmaz (eds.), The SAGE Handbook of Grounded Theory, chapter 11. London: Sage.
Friese, Susanne (2016). CAQDAS and Grounded Theory Analysis. MMG Working Paper 16-07, October 2016.
Guest, Greg, MacQueen, Kathleen M. and Namey, Emily E. (2012). Applied Thematic Analysis. Los Angeles: Sage.
Kelle, Udo and Kluge, Susann (2010). Vom Einzelfall zum Typus: Fallvergleich und Fallkontrastierung in der qualitativen Sozialforschung. Wiesbaden: VS Verlag.
Kuckartz, Udo (1995). Case-oriented quantification. In U. Kelle (ed.), Computer-Aided Qualitative Data Analysis: Theory, Methods and Practice. London: Sage. pp. 158–66.
Richards, Lyn (2009). Handling Qualitative Data: A Practical Guide (2nd ed.). London: Sage.
Richards, Lyn and Morse, Janice M. (2013). Readme First for a User’s Guide to Qualitative Methods (3rd ed.). Los Angeles: Sage.
Saldaña, Johnny (2021). The Coding Manual for Qualitative Researchers (4th ed.). London: Sage.
Silver, Christina and Lewins, Ann (2014). Using Software in Qualitative Research: A Step-by-step Guide (2nd ed., online update 2017). London: Sage.