Glossary of terms

Questionnaire, survey, poll, frequency and other terms simply...

A

Aggregate data

Data that have been summarized or otherwise statistically processed (percentages, ratios, etc.). The opposite of aggregate data is disaggregate data.

B

Battery of questions

Questions of the same or similar character that share the same set of answers can be grouped into a battery of questions. The respondent gets the impression that there are fewer questions, and answering them becomes clearer and faster. On the other hand, the clarity of a battery decreases when too many questions are grouped into it.

C

CAPI

CAPI (Computer Assisted Personal Interviewing) is a questioning technique in which the interviewer questions the respondent in person and records the answers in an electronic questionnaire displayed on a portable device (tablet, laptop, smartphone).

CATI

CATI (Computer Assisted Telephone Interviewing) is a questioning technique in which the interviewer questions the respondent over the telephone and records the answers in an electronic questionnaire.

CAWI

CAWI (Computer Assisted Web Interviewing) is a questioning technique in which the respondent fills in an electronic questionnaire over the internet. The respondent can receive the form by e-mail or find it on a website. This technique is less expensive and less time-consuming than CAPI and CATI. However, a few points have to be taken into account when it is used:

Not everyone has a computer or internet access.
Some respondents may worry about misuse of their data and may therefore answer some (or all) of the questions untruthfully.
Results vary according to the method used to select the respondents (custom database, online panel, random selection).

Correlation ratio

The correlation ratio indicates the strength of a non-linear relationship between numerical variables. It takes values from 0 to 1: the value 0 indicates independence of the variables, while a value close to 1 indicates strong dependence.
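
As a rough illustration, the correlation ratio can be computed as the square root of the between-group share of the total sum of squares, with one variable grouped by the values of the other. The function name `correlation_ratio` and the sample data are assumptions made for this sketch only.

```python
from collections import defaultdict
from math import sqrt


def correlation_ratio(x, y):
    """Return eta for y grouped by the distinct values of x (hypothetical helper)."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    grand_mean = sum(y) / len(y)
    # Variation of the group means around the overall mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Total variation of y around the overall mean.
    ss_total = sum((yi - grand_mean) ** 2 for yi in y)
    return sqrt(ss_between / ss_total) if ss_total else 0.0


# Invented data: y depends on x through a parabola, a non-linear relationship.
x = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]
y = [xi ** 2 for xi in x]
eta = correlation_ratio(x, y)
print(eta)  # 1.0, because y is fully determined by x
```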

Categorization

Categorization is the process of defining the categories that individual variables can take and that will be used in the rest of the research. For the age of respondents or other quantitative parameters (height, weight, etc.), we can either create categories in the form of intervals, grouping several variations of the variable into one category, or, on the contrary, create a separate category for each variation. Depending on the level of detail, we distinguish categorization of first degree data, second degree data and higher degree data. Categories must be mutually exclusive, so that each respondent's answer falls into exactly one category. The number of categories should reflect what is to be found out and which further analyses will be performed on the acquired data.
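
The interval approach can be sketched as follows. The age boundaries, the labels and the helper `age_category` are hypothetical choices for this example. Each age maps to exactly one label, which keeps the categories mutually exclusive.

```python
from bisect import bisect_right

# Hypothetical lower bounds of the age intervals and their labels.
EDGES = [18, 30, 45, 60]
LABELS = ["under 18", "18-29", "30-44", "45-59", "60 and over"]


def age_category(age):
    """Return the single category that the given age falls into."""
    return LABELS[bisect_right(EDGES, age)]


ages = [17, 18, 29, 30, 44, 60, 75]
categories = [age_category(a) for a in ages]
print(categories)
```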

Contingency coefficient

The contingency coefficient measures the strength of the relationship between two qualitative variables. Two contingency coefficients are commonly used: Cramér’s and Pearson’s.
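
Both coefficients are usually derived from the chi-square statistic of a contingency table. The sketch below uses the common textbook formulas. The function names and the 2x2 table are assumptions for illustration.

```python
from math import sqrt


def chi_square(table):
    """Return (chi-square statistic, number of cases) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def pearson_c(table):
    """Pearson's contingency coefficient: sqrt(chi2 / (chi2 + n))."""
    chi2, n = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, columns) - 1)))."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))


# Invented counts: rows = women, men; columns = satisfied, dissatisfied.
table = [[30, 10], [10, 30]]
c = pearson_c(table)
v = cramers_v(table)
print(round(c, 3), round(v, 3))  # 0.447 0.5
```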

Correlation coefficient

The correlation coefficient is used to determine the strength of the linear relationship between two numerical variables. It can take on values from -1 to 1:

A negative value indicates a negative relationship: as the values of one variable increase, the values of the other decrease.
A positive value indicates a positive relationship: the values of both variables increase together.
The closer the value of the correlation coefficient is to -1 or 1, the stronger the linear relationship between the variables. A value of 0, on the other hand, indicates that no linear dependence between the pair of variables was detected.
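
A minimal sketch of the Pearson correlation coefficient, written out by hand so that the formula is visible. The helper name `pearson_r` and the data are invented for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # both variables increase
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # one increases, the other decreases
# A symmetric parabola: strong dependence, but not a linear one.
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(r_up, r_down, r_curve)
```

The last case shows why a value of 0 only means that no linear dependence was detected: the variables are clearly related, just not along a straight line.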

Correlation

Correlation expresses dependence between numerical variables. Depending on the type of relationship between the variables, we calculate the strength of correlation using either the correlation coefficient (for linear dependence) or the correlation ratio (for non-linear dependence).

Correlation Analysis

Correlation analysis is a process in which we determine the strength of the relationship between two numerical variables. The correlation coefficient measures the strength of a linear relationship between the pair of variables; the correlation ratio measures the strength of a non-linear relationship between them.

Coding of responses

Coding is a process in which a code (usually numerical) is assigned to each question and to each of its categories. It makes computer-based data processing easier and faster. For closed questions, codes can be assigned in advance. For open questions, it is necessary to study the responses first, create mutually exclusive categories according to their content, and then assign codes to those categories.
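
For a closed question, coding can be as simple as a lookup table prepared in advance. The codebook and the responses below are made up for illustration.

```python
# Hypothetical codebook for one closed question, defined before data collection.
codebook = {"yes": 1, "no": 2, "do not know": 3}

responses = ["yes", "no", "yes", "do not know"]

# Replace each textual answer with its numerical code.
coded = [codebook[r] for r in responses]
print(coded)  # [1, 2, 1, 3]
```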

Conversation

The most common data acquisition method in qualitative research, in which an experienced interviewer with psychological or sociological training speaks either to one respondent (depth individual interview) or to several respondents. Conversation is also used in quantitative research, where a larger number of respondents participate and the requirements placed on the interviewer are lower.

Customer satisfaction

Information about how satisfied customers are with a product or service, which is important for the success of the company offering it on the market. Research can cover immediate satisfaction (right after purchase or use) or cumulative satisfaction (after a longer period or after the product has been used a number of times).

Categorization of the second degree data

Comparison of combinations of two selected variables and their variations, searching for similarities and differences between them (e.g. satisfaction of women with the product, dissatisfaction of women with the product, satisfaction of men with the product, dissatisfaction of men with the product). For more information on the process, see Categorization.

Categorization of the first degree data

Investigation of the frequency of individual variables and their values (e.g. the number of women and men who filled in the questionnaire). For more information on the process, see Categorization.

Categorization of higher degree data

Comparison of combinations of a larger number of variables and their values, searching for differences and relationships between them (e.g. differences in satisfaction with the product among men with secondary education). For more information on the process, see Categorization.

Comprehensive investigation

An investigation to which all units of the statistical population are subjected. It is costly in time and money and often impossible.

Characteristics

Attributes of units in a population, which can be classified according to:

How many units in the population have them:
- Common characteristics – shared by all units that are to be included in the researched sample.
- Variable characteristics – differ between units in the population and are the subject of research.
The form in which they are expressed:
- Verbal characteristics – expressed in words and taking either two alternatives (alternative verbal characteristics) or more than two (plural verbal characteristics).
- Quantitative characteristics – expressed in numerical form, giving either the rank of an originally verbal characteristic (serial characteristic) or a value obtained by a scale or measurement (measurable characteristic).

D

Data

Data are values which provide information on the required phenomena, entities or relationships between them. We can classify them by the source from which they come (primary and secondary data) or by the extent of their processing (aggregate and disaggregate data).

Data-collection methods

These are the methods by which data on the investigated problems are collected. Different methods are used depending on the type of research: questioning, observation and experiment are the most common methods in quantitative research, while depth individual interviews and focus groups are the most common methods in qualitative research.

Disaggregate data

Data that have not been statistically processed and are available in the form in which they were acquired. The opposite of disaggregate data is aggregate data.

Data collecting

A process that uses selected methods and techniques to obtain data from respondents. In quantitative research, the preferred methods of data collection are questioning, observation and experiment; in qualitative research, the preferred methods are individual and group interviews.

Dependency

A relationship between variables in which changes in one variable influence the other. For numerical variables, dependency can be investigated using the correlation ratio and the correlation coefficient; for verbal variables, using the contingency coefficient.

Data processing

The process of preparing data for analysis, which encompasses data revision (checking completeness and logical correctness), categorization and coding.

Depth individual interview (interview)

A questioning technique in qualitative research in which the interviewer speaks face to face with a single respondent and tries to reveal what is happening in the respondent’s mind. A depth individual interview should take 60 minutes at most; after that, the respondent’s attention wavers. This technique is costly in time and money, but it provides deep insight into the examined issues.

E

Experiment

An experiment is a research method that obtains data from situations specially designed for this purpose. The observer then records the changes in behavior and how they relate to the original settings. Experiments can be performed either in an artificially created environment (laboratory experiments) or in a natural environment (field experiments).

F

Frequency

Frequency indicates the number of occurrences of individual variables and their variations in the data set, for example how many respondents answered a specific question or how many respondents picked each of the alternative answers. There are three types of frequency:

Absolute frequency: The number of occurrences of each individual response to the question.
Relative frequency: The number of occurrences of each individual response relative to the total number of responses to the question. It is expressed as a percentage and is usually more informative than the absolute frequency because it shows how the individual responses are distributed.
Cumulative frequency: The running total of the relative frequencies over the individual variations of the answer.
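
The three types of frequency can be computed together, as in the following sketch. The ratings on a five-point scale are invented sample data.

```python
from collections import Counter

# Invented answers to one question on a scale from 1 to 5.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

absolute = Counter(ratings)  # absolute frequency of each answer
n = len(ratings)

rows = []
running = 0
for value in sorted(absolute):
    running += absolute[value]
    relative = 100 * absolute[value] / n  # share of all answers, in percent
    cumulative = 100 * running / n        # running total of the relative shares
    rows.append((value, absolute[value], relative, cumulative))

for row in rows:
    print(row)  # (answer, absolute, relative %, cumulative %)
```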

Focus Group

A focus group is a questioning technique in qualitative research. Respondents discuss a selected topic in a group of usually six to ten members and interact in the presence of a trained moderator. The results are recorded in writing or with multimedia devices (audio, video) and then analyzed.

G

Group interviews

See Focus Group.

H

Hypothesis

A hypothesis is a proposition about the condition of a selected variable or variables and the relationships between them that has not yet been confirmed. It is either confirmed or refuted by analyzing the data. Based on their formulation, we distinguish two types of hypotheses:

A descriptive hypothesis describes the condition of a certain variable.
An explanatory hypothesis states a relationship between selected variables.

Hypothesis testing

A method of statistical generalization that makes it possible to verify the credibility of assumptions about the statistical population using results from research on the selected sample.

I

Information

Information is the message we get by organizing and processing data. It helps us understand objects of interest, environments or the relationships among them. We can look at it from many points of view, such as:

Sources from which it is obtained:
- Primary information results from primary, first-hand data that have not been obtained anywhere else yet.
- Secondary information is obtained from previously conducted research.
What the content is: These may be facts, motives, opinions, etc.
The form in which it is expressed:
- Numerical information is expressed by numerical values.
- Text information is derived from texts.
- Other information is expressed in a way other than numerical values or text, for example by sounds or images.
The nature of the investigated phenomenon:
- Quantitative information is measurable; it has a numerical attribute (number, frequency, ratio, etc.).
- Qualitative information is difficult to measure; it does not have a numerical attribute and reflects e.g. opinions, motives or satisfaction.
Accessibility:
- Public information is freely available to the general public.
- Non-public information is accessible only to a closed group of people.

Inexhaustive research

A type of research in which only selected units of the statistical population are examined. The results are therefore valid only for the selected sample and need to be subjected to methods of statistical generalization, such as estimation or hypothesis testing, in order to be applicable to the whole statistical population.

Interview

See Depth individual interview.

Investigation

Acquisition of data from individuals, households or other units. An investigation can be:

Comprehensive, when all units of the statistical population are investigated, which is costly in time and money and often impossible.
Non-comprehensive (inexhaustive), when only some of the units are selected for the research and the acquired data apply only to the selected sample. These data need to be generalized by statistical methods.

L

Linear dependence

A direct relationship between numerical variables, which can therefore be illustrated graphically by a straight line.

M

Median (50% quantile)

The median is also called the 2-quantile. It is the number that splits a data set arranged in ascending order into two parts containing the same number of values. With an odd number of values, the median is the middle value. With an even number of values, the median is the arithmetic mean of the two middle values.
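
The odd and even cases can be checked with Python's standard `statistics` module. The two small data sets are invented for the example.

```python
from statistics import median

odd = [7, 1, 3]       # sorted: 1, 3, 7 -> the middle value
even = [7, 1, 3, 5]   # sorted: 1, 3, 5, 7 -> mean of the two middle values

median_odd = median(odd)
median_even = median(even)
print(median_odd, median_even)  # 3 4.0
```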

Mode

The mode, also called the typical or modal value, is the value in the data set with the largest number of occurrences, i.e. with the highest frequency. The mode can be determined for both numerical and verbal variables.
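
A short sketch using the standard `statistics` module, with invented data. `multimode` (available from Python 3.8) is included because a data set can have more than one most frequent value.

```python
from statistics import mode, multimode

drinks = ["tea", "coffee", "tea", "water"]  # verbal variable
scores = [1, 1, 2, 2, 3]                    # numerical variable with two modes

favourite = mode(drinks)
all_modes = multimode(scores)
print(favourite, all_modes)  # tea [1, 2]
```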

Market research

See Marketing Research.

Marketing exploration

A one-off activity used to obtain current information about the market using a selected research method. Only simple statistical methods are used to analyze data acquired in this way.

Middle value

A characteristic that describes the typical level of the investigated phenomenon and enables it to be compared across two or more data sets. Middle values include the mode, the median and the arithmetic mean.

Marketing Research

Marketing research is the systematic study of the market, its participants and the relationships between them, carried out to obtain data for the strategic and tactical decision-making of marketing managers or to get feedback. Marketing research can be classified according to, for example:

Main aim: basic (academic, scientific) research and applied (commercial) research.
Extent of research and the nature of the investigated data: quantitative research and qualitative research.
Method of data acquisition: primary research and secondary research.
Approach to the researched problems:
- Descriptive research, which describes the current condition of the problems.
- Diagnostic research, which looks for the causes of the existing condition.
- Prognostic research, which tries to predict the future development of the problems.

N

Non-linear dependence

An indirect relationship between numerical variables, which is therefore illustrated graphically by a curve other than a straight line.

O

Omnibus

Research in which a number of different clients cooperate and share the expenses. Each client then receives the results connected to its own questions.

Observation

A quantitative research method in which there is no direct contact between the observer and the observed person, and in which the observer records the reactions and behaviour of the observed individual.

P

Poll

By a poll we mean one or a few questions on a selected topic, which can be found e.g. in magazines or on websites, or asked in person on the street or at a shopping center. People with more free time (such as seniors, students, women on maternity leave or parents on parental leave) tend to participate in this type of survey more often than others. This is called self-selection, and the acquired sample is not representative. The poll is therefore a convenient way to contact a specific target audience (customers, partners, employees, etc.) or to strengthen the relationship with this audience, but it is not a tool for obtaining data for strategic decision making.

Primary research

Field research in which the required information is obtained directly from respondents, i.e. information that has not yet been collected for any other research purpose.

Pivot Chart

A chart that visualizes the data from a Pivot Table. Variations of the first variable (from the rows of the Pivot Table) are placed on the first chart axis, while the variations of the second variable (from the columns of the Pivot Table) are on the second chart axis. The data from the data fields are presented according to the chart type, e.g. as points (Scatter Chart), lines (Line Chart) or columns (Column Chart).

Pivot Table

The Pivot Table is the result of cross tabulation, i.e. sorting and summarizing two variables. It illustrates the relationship between the selected variables in a table: variations of the first variable are presented in the rows, variations of the second variable in the columns. The data fields of the table contain the number of cases in which the specific variations in the corresponding row and column occurred together.
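
Cross tabulation of two variables can be sketched with a counter over pairs of values. The respondents and their answers are invented sample data.

```python
from collections import Counter

# Invented sample: (gender, satisfaction) for each respondent.
answers = [
    ("woman", "satisfied"), ("woman", "satisfied"), ("woman", "dissatisfied"),
    ("man", "satisfied"), ("man", "dissatisfied"), ("man", "dissatisfied"),
]

counts = Counter(answers)
row_labels = sorted({gender for gender, _ in answers})  # first variable -> rows
col_labels = sorted({opinion for _, opinion in answers})  # second variable -> columns

# Each data field holds the number of cases with that row/column combination.
table = {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

for r in row_labels:
    print(r, table[r])
```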

Panel

A group of chosen representatives that is questioned repeatedly about the same or a similar problem. The data are cheaper to acquire, and it is possible to monitor how the researched problem evolves over time.

Primary data

Data that have not been collected before and are therefore acquired from the units of the target group with the specific research in mind. The opposite of primary data is secondary data.

Pre-research

Verification of the process, methods and tools of data collection, carried out on a small number of respondents before the research project starts. Usually the questionnaire is tested: its difficulty, its logic and the formulation of its questions.

Purposive sampling

The process of selecting respondents from the statistical population based on the judgement of the interviewer and his or her knowledge of the statistical population, not by chance. The following techniques fall under this category of sampling:

Quota sampling – an attempt to create the most accurate scaled-down model of the statistical population, preserving the predefined, crucial characteristics and their proportions (percentage of the statistical population). Units can be selected within these quotas either randomly (the data can be generalized to some extent) or based on the interviewer’s judgement (the data cannot be generalized).
Typological selection – based on identifying typical representatives of the statistical population (women on maternity leave, pensioners, etc.), who are then investigated.
Appropriate occasion – the researcher selects easily accessible respondents.
Appropriate judgement – the researcher selects the respondents who are most likely to provide the necessary information.

Q

Questionnaire

A tool for the research method of questioning: a form with questions on a selected topic. The respondent can be asked to answer these questions through an interviewer (CAPI, CATI) or without the intervention of an interviewer, via the internet (CAWI).

Questioning (interviewing)

Qualitative research

A type of primary research that looks for answers to the question: Why? It tries to identify the internal processes of respondents and the causes and motives of their behavior, which operate in the conscious and subconscious mind. Qualitative research brings deep insight into the examined issue, and the participation of a psychologist is generally required to evaluate the acquired data. Generalizing the data to the defined population is impossible or very difficult. The basic methods of qualitative research include depth individual interviews and focus groups.

Quantiles

Quantiles are values that split a data set arranged in ascending order into parts containing the same number of values. Depending on how many parts the quantiles split the data set into, we talk about, for example:

Median or 2-quantile (50% quantile): One value which splits the data set into two equally sized parts.
Quartiles or 4-quantiles: Three values (first quartile, median, third quartile) which divide the data set into four equally sized parts.
Octiles or 8-quantiles: Seven values which divide the data set into eight equally sized parts.
Deciles or 10-quantiles: Nine values which divide the data set into ten equally sized parts.
Percentiles or 100-quantiles: Ninety-nine values which split the data set into one hundred equally sized parts.
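
The cut points can be computed with `statistics.quantiles` from Python's standard library (Python 3.8 or later). The data set and the choice of the `inclusive` interpolation method are assumptions of this sketch. Other methods give slightly different cut points on small samples.

```python
from statistics import quantiles

# Invented data set, already in ascending order.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n is the number of parts; the result contains n - 1 cut points.
quartiles = quantiles(data, n=4, method="inclusive")
deciles = quantiles(data, n=10, method="inclusive")

print(quartiles)     # [3.0, 5.0, 7.0] -> first quartile, median, third quartile
print(len(deciles))  # 9 cut points for ten parts
```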

Quantitative Research

A type of primary research that looks for answers especially to the question: How many? To obtain a representative and sufficiently large sample for statistical processing, many respondents are addressed in a standardized way. The main methods of quantitative research include questioning, observation and experiment.

Question

A formulation that requests an explanation (information) and that can be classified according to:

Variants of answers:
- Open questions – respondents are not offered any suggested answers, so they can answer without any restrictions.
- Closed questions – the respondent chooses, from a limited set of variants, the one he or she considers correct or the one that most closely matches his or her beliefs.
- Semi-closed questions – a compromise between open and closed questions. Besides the standard variants, the respondent is offered an additional variant, often with room to enter his or her own text response.
Purpose:
- Auxiliary questions – used to approach the respondent and to explain the situation and the rules under which the research data will be collected.
- Content and result questions – concern the core of the research problem; they investigate opinions, attitudes, motives and other information used to derive the research results and recommendations for resolving the problem.

R

Research aims and objectives (research goals)

They define what has to be discovered for the acquired data to yield the information needed to resolve the research problem. When formulating aims and objectives, their number is very important. If it is too low, important options can be omitted; if it is too high, the cost in money and time can increase unnecessarily.

Recommendations

Recommendations are suggested steps that should help to solve or minimize the research problem. They should be based not only on the analysis of the data and its conclusions, but also on the defined research goals.

Response

An assertion that reflects the respondent’s opinion, attitude, motives, knowledge or experience with regard to the question.

Research plan

A document that contains all information concerning the planned research, such as the researched problem, research goals, the necessary information and its structure, the target group and the means of its selection, techniques and methods of data collection, dates of the research, data processing and the dates for the final data presentation.

Random sampling

The group of respondents is selected from the statistical population by chance. The probability of selection is either the same for all units or differs between units. Selection can be made:

With replacement, when the selected unit is returned to the statistical population and can be drawn repeatedly.
Without replacement, when the selected unit is not returned to the statistical population and can therefore be drawn only once.
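
Both kinds of selection are available in Python's standard `random` module. The population of 20 numbered units, the sample size and the seed are arbitrary choices for this sketch, and every unit has the same probability of selection.

```python
import random

rng = random.Random(42)          # fixed seed so the sketch is repeatable
population = list(range(1, 21))  # hypothetical population of 20 numbered units

# Without replacement: a unit can be drawn only once.
without_replacement = rng.sample(population, k=5)

# With replacement: a unit is returned and can be drawn repeatedly.
with_replacement = rng.choices(population, k=5)

print(without_replacement)
print(with_replacement)
```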

Relevance

The fact that data acquired during the research are important for the problem resolution.

Reliability

Trustworthiness and accuracy of data, meaning that if the research were repeated, the results would be similar.

Respondent

Person participating in the survey, research or questionnaire.

Range

A statistical measure of the variability of a numerical variable, i.e. of how widely the values of the researched variable are spread around its middle value.

Research problem

A problem, usually of a marketing character (loss of customers, low revenues, etc.), for whose resolution data are to be acquired.

S

Submitter’s issue

A problem, usually connected to market (marketing) issues (loss of customers, low revenues), that the submitter wants to resolve by acquiring information.

Self-selection

A non-representative technique for selecting respondents, in which the respondent decides whether to participate in the survey, questionnaire or research.

Secondary data

Data originally acquired for a different research purpose. The opposite of secondary data is primary data.

Secondary research

In secondary research, researchers work with data that have been acquired for other research purposes. The data can be available in aggregated form (aggregate data) or in disaggregated form (disaggregate data).

Statistical evaluation of data

Data processing using statistical methods.

Scaling

The use of an assessment scale to record an opinion, attitude or behavior by selecting a position on a prepared interval. For clarity, the scale may be supplemented with numbers, words or graphics.

Sampling

The process of selecting units from the statistical population for the sample. It may be random sampling, which allows the research results to be generalized to the statistical population, or purposive sampling, which allows little or no generalization.

Sample

A collection of units (individuals, households, etc.) in which the information crucial for resolving the research problem can be found. These units are selected from the statistical population through sampling.

Statistical population

All units (individuals, households, etc.) in which the attributes important for resolving the research problem can be found.

T

Target audience (target group)

A group of people (customers, potential customers, partners, etc.) that we want to address with a communication campaign. It is necessary to know e.g. its size, its characteristics (demographic, geographic, psychographic or socioeconomic criteria) and its relationship to the researched topic.

V

Variable

A characteristic examined in the research (age, gender, education, satisfaction, etc.) that can take on different values.

Versatility

See Variability.

Validity

The ability of the data to express and measure what they are supposed to express and measure, i.e. their legitimacy and functionality.

Variability

Variability is the ability of units to take on different variations of the variables. If it did not exist, it would not be necessary to question more than one respondent. The variability of numerical variables can be measured by e.g. the standard deviation, which shows how widely the values of the variable are scattered around its mean value. For qualitative variables, the index of qualitative variation can be used.
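
Both measures can be sketched in a few lines. The standard deviation comes from Python's `statistics` module. The helper `iqv` is a hypothetical implementation of one common formulation of the index of qualitative variation, K / (K - 1) * (1 - sum of squared category shares), where K is the number of possible categories. The data are invented.

```python
from collections import Counter
from statistics import pstdev


def iqv(values, k=None):
    """Index of qualitative variation: 0 = no variation, 1 = maximum variation.

    k is the number of possible categories; if omitted, the number of
    categories observed in the data is used (an assumption of this sketch).
    """
    counts = Counter(values)
    k = len(counts) if k is None else k
    if k < 2:
        return 0.0
    n = len(values)
    sum_of_squares = sum((c / n) ** 2 for c in counts.values())
    return k / (k - 1) * (1 - sum_of_squares)


ages = [2, 4, 4, 4, 5, 5, 7, 9]      # numerical variable
spread = pstdev(ages)                # population standard deviation

evenly_split = iqv(["a", "a", "b", "b"])   # answers spread evenly
all_same = iqv(["a", "a", "a", "a"], k=2)  # every respondent answers the same

print(spread, evenly_split, all_same)  # 2.0 1.0 0.0
```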