11 Creating new variables

generate and replace

This chapter shows the basics of creating and modifying variables in Stata. We saw how to work with the Data Editor in [GSW] 6 Using the Data Editor; this chapter shows how we would do this from the Command window. The two primary commands used for this are

generate, for creating new variables. It has a minimum abbreviation of g.

replace, for replacing the values of an existing variable. It may not be abbreviated because it alters existing data and hence can be considered dangerous.

The most basic form for creating new variables is generate newvar = exp, where exp is any kind of expression. Both generate and replace can be used with if and in qualifiers. An expression is a formula made up of constants, existing variables, operators, and functions. Some examples of expressions (using variables from auto.dta) would be 2 + price, weight^2, or sqrt(gear_ratio). The operators defined in Stata are given in the table below:

Arithmetic                 Logical      Relational (numeric and string)
+  addition                !  not       >   greater than
-  subtraction             |  or        <   less than
*  multiplication          &  and       >=  greater than or equal
/  division                             <=  less than or equal
^  power                                ==  equal
+  string concatenation                 !=  not equal

Stata has many mathematical, statistical, string, date, time-series, and programming functions. See help functions for the basics, and see the Stata Functions Reference Manual for a complete list and full details of all the built-in functions.

You can use menus and dialogs to create new variables and modify existing variables by selecting menu items from the Data > Create or change data menu. This feature can be handy for finding functions quickly. However, we will use the Command window for the examples in this chapter because we would like to illustrate simple usage and some pitfalls.

Stata has some utility commands for creating new variables:

The egen command is useful for working across groups of variables or within groups of observations. See [D] egen for more information.

The encode command turns categorical string variables into encoded numeric variables, while its counterpart decode reverses this operation. See [D] encode for more information.

The destring command turns string variables that should be numeric, such as numbers with currency symbols, into numbers. To go from numbers to strings, the tostring command is useful. See [D] destring for more information.
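The utility commands listed above can be sketched briefly. This is a minimal illustration rather than an example from the manual: it assumes the auto dataset shipped with Stata, and every new variable name (meanprice, origin_str, origin_num, price_str, price_num, mpg_str) is hypothetical. It uses thousands separators instead of currency symbols as the nonnumeric characters that destring has to ignore.

```stata
* Sketch only: variable names below are invented for illustration.
sysuse auto, clear

* egen: mean of price within each group of observations defined by foreign
egen meanprice = mean(price), by(foreign)

* decode: labeled numeric variable -> string variable
decode foreign, generate(origin_str)

* encode: string variable -> labeled numeric variable (codes start at 1)
encode origin_str, generate(origin_num)

* Build a string such as "4,099", then let destring strip the commas
generate price_str = string(price, "%9.0fc")
destring price_str, generate(price_num) ignore(",")

* tostring: numeric variable -> string variable
tostring mpg, generate(mpg_str)
```

Because encode numbers the categories in alphabetical order starting at 1, origin_num is not identical to the original 0/1 coding of foreign, even though the labels match.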

We will focus our efforts on generate and replace.
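As a rough sketch of the basic forms described in this introduction, the commands below use the expressions mentioned in the text together with if and in qualifiers. The dataset is the auto dataset shipped with Stata; the new variable names (price2, wt_sq, root_gr, cheap, first10) and the thresholds are assumptions made for illustration.

```stata
* Sketch only: new variable names and cutoffs are invented for illustration.
sysuse auto, clear

* generate newvar = exp, using arithmetic operators and a function
generate price2  = 2 + price
generate wt_sq   = weight^2
generate root_gr = sqrt(gear_ratio)

* g is the minimum abbreviation of generate; a relational expression is 1 or 0
g byte cheap = price < 5000

* replace must be spelled out; here it is combined with an if qualifier
replace cheap = 0 if foreign == 1 & mpg < 20

* An in qualifier restricts the command to a range of observations;
* the remaining observations are set to missing
generate first10 = _n in 1/10
```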


generate

There are some details you should know about the generate command:

The basic form of the generate command is generate newvar = exp, where newvar is a new variable name and exp is any valid expression. You will get an error message if you try to generate a variable that already exists.

An algebraic calculation using a missing value yields a missing value, as does division by zero, the square root of a negative number, or any other computation that is impossible.

If missing values are generated, the number of missing values in newvar is always reported. If Stata says nothing about missing values, then no missing values were generated.

You can use generate to set the storage type of the new variable as it is generated. You might want to create an indicator (0/1) variable as a byte, for example, because it saves 3 bytes per observation over using the default storage type of float.

Below are some examples of creating new variables from the afewcarslab dataset, which we created in Labeling values of variables in [GSW] 9 Labeling data. (To work along, start by opening the automobile dataset with sysuse auto. We are using a smaller dataset to make shorter listings.) The last example shows a way to generate an indicator variable for cars weighing more than 3,000 pounds. Logical expressions in Stata result in 1 for "true" and 0 for "false". The if qualifier is used to ensure that the computations are done only for observations where weight is not missing; Stata treats a missing value as larger than any number, so without the qualifier a car with missing weight would be coded as 1.

[Commands and listing of the seven observations in afewcarslab ("A few 1978 cars") are not recoverable from the source; the surviving output reports "(2 missing values generated)".]
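The points above can be tried out with a short sketch. It is not the manual's own example: it uses the full auto dataset rather than afewcarslab, and the variable names rep_sq, ratio, badroot, and heavy are hypothetical.

```stata
* Sketch only: variable names are invented for illustration.
sysuse auto, clear

* rep78 has missing values in auto.dta, so an expression using it is
* missing for those observations and Stata reports how many
generate rep_sq = rep78^2

* Impossible computations yield missing values rather than an error
generate ratio   = price / 0
generate badroot = sqrt(-price)

* Indicator stored as a byte; the if qualifier keeps a missing weight
* from being coded as 1
generate byte heavy = weight > 3000 if !missing(weight)
```

Without the if qualifier, the expression weight > 3000 would evaluate to 1 for any observation with missing weight, which is rarely what is wanted.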


replace

Whereas generate is used to create new variables, replace is the command used for existing variables. Stata uses two different commands to prevent you from accidentally modifying your data. The replace command cannot be abbreviated. Stata generally requires you to spell out completely any command that can alter your existing data.

[Commands and listings not recoverable from the source. The surviving output shows that an attempt to generate a variable that already exists fails with the message "already defined" and return code r(110), and that the subsequent replace reports "(7 real changes made)".]
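A minimal sketch of this behavior, assuming the auto dataset shipped with Stata: the choice of weight and the rescaling to thousands of pounds are assumptions for illustration, as is the scalar name gen_rc.

```stata
* Sketch only: the variable and the rescaling are chosen for illustration.
sysuse auto, clear

* generate refuses to overwrite an existing variable; capture noisily
* shows the error message but lets the do-file continue
capture noisily generate weight = weight/1000
scalar gen_rc = _rc
display "return code from generate: " gen_rc

* replace must be typed in full and modifies the existing variable
replace weight = weight/1000
```

In auto.dta weight is stored as an int, so replace promotes the storage type as needed to hold the fractional values.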

Suppose that you want to create a new variable, predprice, which will be the predicted price of the cars in the following year. You estimate that domestic cars will increase in price by 5% and foreign cars by 10%.

One way to create the variable would be to first use generate to compute the predicted domestic car prices. Then use replace to change the missing values for the foreign cars to their proper values. In the original example, the generate step reports "(3 missing values generated)" and the replace step reports "(3 real changes made)".

[Commands and listing not recoverable from the source.]
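The two-step approach could look roughly like the sketch below. It assumes the auto dataset shipped with Stata, where foreign is coded 0 for domestic and 1 for foreign cars, so the reported counts will differ from the three observations mentioned in the text.

```stata
* Sketch only: run on the full auto dataset rather than afewcarslab.
sysuse auto, clear

* Step 1: predicted price for domestic cars; foreign cars are left missing
generate predprice = 1.05*price if foreign == 0

* Step 2: fill in the missing values for foreign cars
replace predprice = 1.10*price if foreign == 1
```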

Of course, because foreign is an indicator variable, we could generate the predicted variable with one command:

[Command and listing not recoverable from the source.]
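One way to write such a single command is sketched here; it may not match the manual's exact command. It relies on foreign being 0 or 1, so the multiplier is 1.05 for domestic cars and 1.05 + 0.05 = 1.10 for foreign cars. The name predprice2 is hypothetical, and the two-step version is repeated only to check that both give the same result.

```stata
* Sketch only: predprice2 is an invented name.
sysuse auto, clear

* Two-step version, kept for comparison
generate predprice = 1.05*price if foreign == 0
replace  predprice = 1.10*price if foreign == 1

* One-command version: the indicator switches the multiplier
generate predprice2 = (1.05 + 0.05*foreign)*price

* The two versions agree up to floating-point rounding
assert reldif(predprice, predprice2) < 1e-6
```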


generate with string variables

Stata is smart. When you generate a variable and the expression evaluates to a string, Stata creates a string variable with a storage type as long as necessary, and no longer than that. where is a str1 in the following example:

[Commands and listings not recoverable from the source. The generate step reports "(3 missing values generated)" and the following replace reports "(3 real changes made)".]

Variable      Storage   Display    Value
    name         type    format    label      Variable label
where            str1    %9s

Stata has some useful tools for working with string variables. Here we split the make variable into make and model and then create a variable that has the model together with where the model was manufactured:

[Commands and listing not recoverable from the source; the output reports "(1 missing value generated)".]
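The storage-type behavior described at the start of this section can be checked with a short sketch. It assumes the auto dataset shipped with Stata and compares foreign with its numeric codes 0 and 1, which may differ from how the original example selected domestic and foreign cars.

```stata
* Sketch only: the if conditions are an assumption about the coding of foreign.
sysuse auto, clear

* The expression is a one-character string, so Stata creates a str1
generate where = "D" if foreign == 0
replace  where = "F" if foreign == 1

* describe shows the storage type chosen by Stata
describe where
```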

There are a few things to note about how these commands work:

1. ustrpos(s1,s2) produces an integer equal to the position of the first character in the string s1 at which the string s2 is found, or 0 if it is not found. In this example, ustrpos(make," ") finds the position of the first space in each observation of make.

2. usubstr(s,start,len) produces a string of length len characters, beginning at character start of string s. If len = ., the result is the string from character start to the end of string s.

3. Putting 1 and 2 together: usubstr(s,ustrpos(s," ")+1,.) gives the string s with its first word removed. (If s contains no space, ustrpos() returns 0 and the whole string is returned.) Because make contains both the make and the model of each car, and the name of the make itself never contains a space in this dataset, we have found each car's model.

4. The operator "+", when applied to string variables, will concatenate the strings (that is, join them together). The expression "this" + "that" results in the string "thisthat". When the variable modelwhere was generated, a space (" ") was added between the two strings.

5. The missing value for a string is nothing special; it is simply the empty string "". Thus the value of modelwhere for the car with no make or model is " D" (note the leading space).

6. If your strings might contain Unicode characters, use the Unicode versions of the string functions, as shown above. See [U] 12.4.2 Handling Unicode strings.
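The notes above can be put together in a short sketch. It assumes the auto dataset shipped with Stata rather than afewcarslab, and it builds where with cond() in a single step, which is a substitution made here for brevity. For any observation whose make contains no space, model will simply equal make.

```stata
* Sketch only: run on the full auto dataset; where is built with cond()
* for brevity.
sysuse auto, clear
generate where = cond(foreign == 1, "F", "D")

* Position of the first space, then everything after it
generate model = usubstr(make, ustrpos(make, " ") + 1, .)

* String concatenation with an explicit space between the parts
generate modelwhere = model + " " + where

list make model where modelwhere in 1/5
```

An empty model would yield a modelwhere that begins with a space, which is the behavior described in note 5.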