About online surveys

2021-05-08
11 min read

Conducting online surveys is something almost every researcher has to do at one point. Numerous projects like Limesurvey or Google Forms can be used to create, distribute and analyse surveys. They are feature rich and they are easy to use. At the end of the day, you want to focus on your research, right? Online survey services enable us to do that. They are great because they remove a lot of complexity from the whole process of creating an online survey. But is that really true? How much complexity does it involve to create, distribute and analyse surveys? Is it a burden we need big tech to take from us or can we accomplish good results with simple tools we create ourselves? I created my own online survey tool almost from scratch to explore these questions.

My Concerns

Using third-party services always comes with a mixed bag of constraints, limitations and even legal issues. The privacy of users and participants is an obvious case. But before coming back to that, I want to talk a little about another big issue: vendor lock-in.

Vendor lock-in

You are locked in by a vendor when you depend on their product to an extent that makes it impossible or very expensive to switch to a different product. This can be painful when the vendor suddenly raises prices, but it is fatal when the vendor ceases to exist or drops support for a product. Vendor lock-in can be avoided by using products that are built on open standards. In practice that means that the data I need to provide to use a service is in a format based on an open standard, and that all results I produce with the product can be extracted and are structured using an open standard. Take email for example. The whole process of writing, sending and retrieving emails is based on open standards. If I decide to use a commercial email provider like Gmail, the only component that I cannot take with me is my email address, as it is tied to the ‘gmail.com’ domain which belongs to Google. But if I want to use Gmail without vendor lock-in, I can buy a domain from an independent registrar and use it with Gmail. As I am the owner of the domain, I can always migrate all accounts to another provider like ProtonMail (or to my own email server). Email is just one example. We can look at all kinds of online services, break them down into small components and identify which components use open standards. If some or all of them don’t, chances are high that we cannot move away from the service and risk being locked in. Avoiding vendor lock-in is vital if we need to use a service over a long stretch of time.

Privacy

Privacy is another big issue when we use online services. Every interaction reveals behavioural patterns. Metadata alone can reveal a lot about who we are and what we do. Take this quote from the paper Privacy Issues in Internet Surveys by Cho and LaRose, written in 1999 [1]:

Other sites gather so much data that the sheer volume of information raises privacy concerns. For example, one widely publicized offer gave users free personal computers in exchange for rights to their personal information and web surfing habits and even mandated a monthly minimum usage requirement to ensure that sufficient data were obtained

What Cho and LaRose wrote about in 1999 is called surveillance capitalism by others today. Today we do not need to trade privacy to get access to a personal computer, but when we use our smartphones we are still caught in this scheme. It is important to me to stress that we are free to make the choice to engage in this trade if we find it reasonable (for example when using Google’s services), but I think we should never make this choice for other people (especially if we want them to help us). Survey participants do not only leave their metadata behind. The answers themselves may contain confidential information we need to protect. We should be conscious of that.

The Motivation

The above thoughts motivated me to ask: Is risking our participants’ privacy and our own autonomy as we slip into a vendor lock-in a good and reasonable trade-off compared to the struggle of managing a service by ourselves without involving a third party? I looked at the components of an online survey service and developed a minimalistic web service myself.

The online survey process

An online survey usually involves four steps:

  1. Creating a survey
  2. Providing the survey to users
  3. Gathering results
  4. Analysing the results

The four steps usually follow one after the other, and there are several considerations that come to mind for each of them when we look at them individually. For each step I will look for a minimalistic solution, which will help us to evaluate how much value an online tool adds when it solves that step for us.

Creating a survey

When we create a survey, the biggest chunk of our work is coming up with questions and deciding how we would like to ask them. Which details are necessary, which questions should we not ask and how can we avoid bias both on our side and on the side of the participants? Writing down the survey outline is the easier part, which is usually aided by a tool. There are a handful of different question types which have attributes like the available answers to choose from or the range of a scale the participant can pick from. Each question and the survey itself has some meta attributes like a title, the order of the questions and so on. We can easily represent a survey in a simple format like YAML.

title: Awesome survey
subtitle: Just a test
pages:
- title: Page 1
  questions:
  - title: How do you feel
    type: checkbox
    choices:
    - text: Fantastic! 😊
      value: 1
    - text: Tired 🥱
      value: 2
  - title: What do you think about this blog post?
    type: checkbox
    choices:
    - text: Great, thanks! 🙂
      value: 1
    - text: Confusing 😵‍💫
      value: 2
    - text: Did not read it 😛
      value: 3
  - title: remarks?
    type: comment
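
To show how little structure such an outline really needs, here is a minimal sketch in Python that holds a shortened version of the outline as plain data (roughly what a YAML parser would hand back) and checks it for obvious mistakes. The function name find_problems and the set KNOWN_TYPES are made up for this illustration, and the emojis are left out for brevity.

```python
# A shortened version of the outline as plain Python data.
# KNOWN_TYPES is an assumption: it only lists the two question types used in this post.
KNOWN_TYPES = {"checkbox", "comment"}

survey = {
    "title": "Awesome survey",
    "subtitle": "Just a test",
    "pages": [
        {
            "title": "Page 1",
            "questions": [
                {
                    "title": "How do you feel",
                    "type": "checkbox",
                    "choices": [
                        {"text": "Fantastic!", "value": 1},
                        {"text": "Tired", "value": 2},
                    ],
                },
                {"title": "remarks?", "type": "comment"},
            ],
        }
    ],
}


def find_problems(outline):
    """Return a list of readable problems; an empty list means nothing was found."""
    problems = []
    for page in outline.get("pages", []):
        for question in page.get("questions", []):
            title = question.get("title", "<untitled>")
            question_type = question.get("type")
            if question_type not in KNOWN_TYPES:
                problems.append(f"{title}: unknown type {question_type!r}")
            if question_type == "checkbox":
                values = [choice["value"] for choice in question.get("choices", [])]
                if not values:
                    problems.append(f"{title}: checkbox without choices")
                if len(values) != len(set(values)):
                    problems.append(f"{title}: duplicate choice values")
    return problems


print(find_problems(survey))
```

A check like this catches typos in the outline before any participant sees the survey.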

Providing the survey to users

Now we need to get the survey we created in step 1 to the participants. We need to think about how we want to present the questions to them. Do we want to address participants individually or not? Is the survey open or closed? In my case I want the survey to be accessible to everyone. Putting it on a website is fine for me.

Putting the outline on a website requires me to write some HTML for each question type and some CSS to make it look nice. The HTML version of the checkbox type could look like this.

<h3>How do you feel?</h3>
<input type="checkbox" id="answer1" name="answer1" value="1">
<label for="answer1"> Fantastic 😊</label><br>
<input type="checkbox" id="answer2" name="answer2" value="2">
<label for="answer2"> Tired 🥱</label><br>

On my website this would render as something like:

How do you feel?
☐ Fantastic 😊
☐ Tired 🥱


As we do not want to write the HTML every time we change the questions, we need a tool that generates the HTML for us. While looking for software that does exactly that, I found SurveyJS. We can feed our YAML outline into SurveyJS and it turns it into a nice-looking website. Now we only need to put it onto our web server.
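
To make the idea of generating HTML from the outline a bit more concrete, here is a minimal sketch in Python. It is not how SurveyJS works internally; it only turns one checkbox question into a fragment like the hand-written one above. The function name render_checkbox_question and the id scheme q<number>-<value> are assumptions made for this example.

```python
from html import escape


def render_checkbox_question(number, question):
    """Render one checkbox question of the outline as an HTML fragment."""
    lines = [f"<h3>{escape(question['title'])}</h3>"]
    for choice in question["choices"]:
        # Escape everything that comes from the outline so it cannot break the markup
        value = escape(str(choice["value"]))
        text = escape(choice["text"])
        # One id per question and choice, so that each label points at its own input
        input_id = f"q{number}-{value}"
        lines.append(f'<input type="checkbox" id="{input_id}" name="q{number}" value="{value}">')
        lines.append(f'<label for="{input_id}"> {text}</label><br>')
    return "\n".join(lines)


question = {
    "title": "How do you feel?",
    "type": "checkbox",
    "choices": [
        {"text": "Fantastic 😊", "value": 1},
        {"text": "Tired 🥱", "value": 2},
    ],
}
print(render_checkbox_question(1, question))
```

A real generator also has to cover the other question types, pages and styling, which is exactly the work a library like SurveyJS takes off our hands.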

Gathering the results

Gathering the results of the survey means that we take the data a participant entered into the form (in their browser) and send it to our server, which stores it for us. There are a lot of ways to do this. A very simple approach is to let the server provide a POST endpoint which takes data from a browser and saves the results on the server’s hard drive. Flask is a very simple web server framework that allows us to define such a POST endpoint.

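import tempfile
from flask import Flask, request

app = Flask(__name__)

@app.route("/save-survey", methods=["POST"])
def save_survey():
    survey_data = request.get_data(as_text=True)  # request body as str, not bytes
    # delete=False keeps the uniquely named file after it is closed
    with tempfile.NamedTemporaryFile("w", dir=".", suffix=".json", delete=False) as file:
        file.write(survey_data)
    return "", 204  # a Flask view must return a response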

The /save-survey endpoint takes the survey data which is contained in the request made by the participant’s browser and saves it in a file. Note that we still need to add some security features before putting this online!
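
One of those security features is refusing input that does not look like a survey result. The following is a minimal sketch of such a check in plain Python, which could be called inside the endpoint before anything is written to disk. The size limit, the list of allowed question names and the function name parse_submission are assumptions for this example. It only covers input validation; things like HTTPS and rate limiting still need to be handled separately.

```python
import json

MAX_BYTES = 10_000  # assumed upper bound for a single submission
ALLOWED_KEYS = {"question1", "question2", "question3"}  # assumed question names


def parse_submission(raw_body):
    """Return the submitted answers as a dict, or raise ValueError if the body looks wrong."""
    if len(raw_body) > MAX_BYTES:
        raise ValueError("submission too large")
    try:
        answers = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError) as error:
        raise ValueError("submission is not valid JSON") from error
    if not isinstance(answers, dict):
        raise ValueError("submission must be a JSON object")
    unexpected = set(answers) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"unexpected keys: {sorted(unexpected)}")
    return answers


print(parse_submission(b'{"question1": 5, "question2": "This is a comment"}'))
```

In the Flask endpoint, a ValueError raised here would be turned into a 400 response instead of saving the file.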

In our HTML, which runs in the browser of the participant, we just need to add a JavaScript function which sends the survey data to this endpoint and we are already done 🥳.

function sendDataToServer(survey) {
  // POST the answers as JSON to the /save-survey endpoint
  axios.post('/save-survey', survey.data)
    .then(function (response) { console.log("survey submitted! 🥳") })
    .catch(function (error) { console.log("Encountered an error. 😒") })
}

Analyse the results

Now it is time to get creative! A lot of people have submitted their answers and I want to analyse the data to answer the research question. The good news is that there are a lot of tools I can use to do this. They do not require me to give the survey data to anyone. All I have to do is to convert the survey data to the format that those tools support. For simple calculations and graphing I can use LibreOffice Calc.

The results were saved as JSON objects, so I first need to convert them to CSV, which can be done using free offline tools like jq or a programming language like Python. I like to use Python, so this is what I would do:

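import csv
import glob
import json

with open('results.csv', 'w', newline='') as results:
  # quote text fields so that commas inside comments do not break the CSV
  writer = csv.writer(results, quoting=csv.QUOTE_NONNUMERIC)
  # iterate over all the JSON result files in this folder
  for filename in sorted(glob.glob('*.json')):
    with open(filename) as result_file:
      result = json.load(result_file)
    writer.writerow([result['question1'], result['question2'], result['question3']])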

given that the file structure of my result files looks like this:

{
  "question1": 5,
  "question2": "This is a comment",
  "question3": 1
}

The small Python program would produce a file called results.csv which contains something like the following output:

5,"This is a comment",1
1,"another comment",44

I can now open that file in LibreOffice and generate charts that use this data. And that’s already it. I managed to conduct a survey-based study without involving any third party.
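
For a quick look at the numbers without a spreadsheet, the same CSV can also be summarised in Python. This is only a sketch: the function name count_answers is made up, and the sample rows are invented in the format shown above (the third row is there to show that a comma inside a comment is handled).

```python
import csv
import io
from collections import Counter

# Invented sample rows in the same format as results.csv
sample = io.StringIO(
    '5,"This is a comment",1\n'
    '1,"another comment",44\n'
    '5,"a third, with a comma",1\n'
)


def count_answers(csv_file, column):
    """Count how often each value occurs in one column of an open CSV file."""
    reader = csv.reader(csv_file)
    return Counter(row[column] for row in reader if row)


counts = count_answers(sample, 0)
for value, count in counts.most_common():
    print(value, count)
```

With the real data, the StringIO object would be replaced by open('results.csv', newline='').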

Conclusion

I showed that producing an online survey application with a very basic, minimal approach is quite easy if you know a little bit about programming. So let’s review each of the aforementioned steps and identify the benefits a third-party service can offer us, if any.

Benefits of survey providers

  • Creating the survey

There is really not much value an online service like Google Forms can add to make this process easier. I believe that our main task lies in figuring out the questions themselves.

  • Providing the survey to users

Hosting the website ourselves surely comes with some responsibilities and is a lot harder than using a ready-to-use provider service. We need a server that is reachable 24/7, which comes with the responsibility to learn how to operate and maintain a server in a secure way. A provider can also help us to reach more participants if our own network (social and technical) is too small.

  • Gathering results

We need to ensure that the data is backed up and stored in a safe way. Online services usually do that for us, and they are able to allocate a lot more resources to operational security. An online service can also provide features that verify individual users in some way, ensuring that a participant cannot submit multiple times. However, developing such features ourselves is not impossible.

  • Analysing the results

Analysing survey data manually often requires advanced programming skills or a combination of different tools to get results. Using the tools a service provider is offering to generate reports, diagrams and so on makes it easier for non-technical researchers to obtain good results, and it also reduces the risk of mistakes which could threaten the validity of the study. When choosing a survey tool we should evaluate how much time and resources our team has to crunch the numbers manually.
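
Coming back to the point about multiple submissions under “Gathering results”: one possible way to build such a feature ourselves is to hand out one-time tokens and accept each of them only once. The following sketch keeps the tokens in memory, which is an assumption made to keep it short; a real service would have to store them persistently. The class name TokenRegistry is made up for this example.

```python
import secrets


class TokenRegistry:
    """Hands out one-time participation tokens and accepts each of them only once."""

    def __init__(self):
        self._unused = set()

    def issue(self):
        # token_urlsafe returns a random string that is safe to put into a link
        token = secrets.token_urlsafe(16)
        self._unused.add(token)
        return token

    def redeem(self, token):
        """Return True the first time a valid token is presented, False otherwise."""
        if token in self._unused:
            self._unused.remove(token)
            return True
        return False


registry = TokenRegistry()
token = registry.issue()
print(registry.redeem(token))  # first submission with this token
print(registry.redeem(token))  # second attempt with the same token
```

Note that handing out tokens individually can work against anonymity if we keep track of who received which token, so the list of recipients and the list of tokens should be kept apart.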

Benefits of doing it yourself

I agree with what Roberts and Allen [2] write in their paper Exploring ethical issues associated with using online surveys in educational research:

Researchers must act to minimise intrusions on the privacy of research participants at every stage of the research process

I believe that researchers have a responsibility to protect survey participants’ privacy. Especially when we already have the skill set that allows us to run and maintain our own infrastructure, we should feel obliged to build safe and open alternatives to centralised online services, not only to protect participants but also to help researchers who do not have the knowledge required to do this. If you are a software developer, you might want to consider advocating for and contributing to open standards and their free and open implementations.

By breaking down the process and replacing each of the steps with a minimal implementation, I learned that there is not much complexity hidden behind the shiny GUI of the big survey providers out there. This understanding will help me to pick the right solution in the future and to know the limits of a home-brewed solution.

The good news is that one does not necessarily need to build such a system from scratch. Open source tools like SurveyJS or Limesurvey Community edition provide a solid foundation which can be easily extended.

Survey in a Bottle

I uploaded the minimalistic survey app discussed in this blog post to my GitHub account, and as it was a lot of fun to put together, I will probably improve and extend it in the future.

If you have another minute, please fill out the survey below. And rest assured, there are no third parties involved! 🙃


  1. Hyunyi Cho and Robert LaRose, “Privacy Issues in Internet Surveys,” Social Science Computer Review 17, no. 4 (1999): 421–434, 10.1177/08944393990170040

  2. Lynne D. Roberts and Peter J. Allen, “Exploring Ethical Issues Associated with Using Online Surveys in Educational Research,” Educational Research and Evaluation 21, no. 2 (February 17, 2015): 95–108, accessed May 8, 2021, 10.1080/13803611.2015.1024421.

Mark PhD student researching socio-technical enablers/inhibitors of software testing