Serial number ibm spss 19



Analytics plays an increasingly important role in helping your organization achieve its objectives. The residual component of the series for a particular observation. Availability of Procedures in Distributed Analysis Mode: in distributed analysis mode, procedures are available for use only if they are installed on both your local version and the version on the remote server. By default, all estimated models are included in the output. Percentage of total number of models. When working with command syntax, the active dataset name is displayed on the toolbar of the syntax window. A grain processor has received 16 samples from each of 8 crop yields and measured the aflatoxin levels in parts per billion (PPB). Included for round-trip compatibility with IBM® SPSS® Modeler. Prefix for Model Identifiers in Output. Installing the Data.

This document contains proprietary information of SPSS Inc., an IBM Company. It is provided under a license agreement and is protected by copyright law. The information contained in this publication does not include any product warranties, and any statements provided in this manual should not be interpreted as such. When you send information to IBM or SPSS, you grant IBM and SPSS a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. © Copyright SPSS Inc.

Preface

IBM SPSS Statistics

IBM® SPSS® Statistics is a comprehensive system for analyzing data. Examples using the statistical procedures found in add-on options are provided in the Help system, installed with the software. In addition, beneath the menus and dialog boxes, SPSS Statistics uses a command language. Some extended features of the system can be accessed only via command syntax. Those features are not available in the Student Version. Detailed command syntax reference information is available in two forms: integrated into the overall Help system and as a separate document in PDF form in the Command Syntax Reference, also available from the Help menu.

IBM SPSS Statistics Options

The following options are available as add-on enhancements to the full (not Student Version) IBM® SPSS® Statistics Core system:

Statistics Base gives you a wide range of statistical procedures for basic analyses and reports, including counts, crosstabs and descriptive statistics, OLAP Cubes and codebook reports. Additionally, SPSS Statistics Base offers a broad range of algorithms for comparing means and predictive techniques such as t-test, analysis of variance, linear regression and ordinal regression.

Advanced Statistics focuses on techniques often used in sophisticated experimental and biomedical research.
It includes procedures for general linear models (GLM), linear mixed models, variance components analysis, loglinear analysis, ordinal regression, actuarial life tables, Kaplan-Meier survival analysis, and basic and extended Cox regression.

Categories performs optimal scaling procedures, including correspondence analysis.

Complex Samples allows survey, market, health, and public opinion researchers, as well as social scientists who use sample survey methodology, to incorporate their complex sample designs into data analysis.

With Conjoint, you can easily measure the trade-off effect of each product attribute in the context of a set of product attributes, as consumers do when making purchasing decisions.

Custom Tables creates a variety of presentation-quality tabular reports, including complex stub-and-banner tables and displays of multiple response data.

Data Preparation provides a quick visual snapshot of your data. It provides the ability to apply validation rules that identify invalid data values. You can also save variables that record individual rule violations and the total number of rule violations per case.

Exact Tests calculates exact p values for statistical tests when small or very unevenly distributed samples could make the usual tests inaccurate. This option is available only on Windows operating systems.

Missing Values describes patterns of missing data, estimates means and other statistics, and imputes values for missing observations.

Neural Networks can be used to make business decisions by forecasting demand for a product as a function of price and other variables, or by categorizing customers based on buying habits and demographic characteristics. Neural networks are non-linear data modeling tools.

It includes procedures for probit analysis, logistic regression, weight estimation, two-stage least-squares regression, and general nonlinear regression.

Commercial, government, and academic customers worldwide rely on SPSS Inc.
Technical support Technical support is available to maintenance customers. Customers may contact Technical Support for assistance in using SPSS Inc. To reach Technical Support, see the SPSS Inc. Be prepared to identify yourself, your organization, and your support agreement when requesting assistance. Training Seminars SPSS Inc. All seminars feature hands-on workshops. Seminars will be offered in major cities on a regular basis. Additional Publications The SPSS Statistics: Guide to Data Analysis, SPSS Statistics: Statistical Procedures Companion, and SPSS Statistics: Advanced Statistical Procedures Companion, written by Marija Norušis and published by Prentice Hall, are available as suggested supplemental material. These publications cover statistical procedures in the SPSS Statistics Base module, Advanced Statistics module and Regression module. Whether you are just getting starting in data analysis or are ready for advanced applications, these books will help you make best use of the capabilities found within the IBM® SPSS® Statistics offering. Reading Excel 95 or Later Files. Reading Older Excel Files and Other Spreadsheets. Reading IBM SPSS Data Collection Data. Saving Data Files in External Formats. Saving Data Files in Excel Format. Saving Data Files in SAS Format. Saving Data Files in Stata Format. Saving Subsets of Variables. Exporting to a Database. Exporting to IBM SPSS Data Collection. To Select, Switch, or Add Servers. Searching for Available Servers. Opening Data Files from a Remote Server. Inserting line breaks in labels. Applying variable definition attributes to multiple variables. To enter numeric data. To enter non-numeric data. To use value labels for data entry. Data value restrictions in the data editor. Cutting, copying, and pasting data values. To change data type. Finding cases, variables, or imputations. Finding and replacing data and attribute values. Selecting Source and Target Variables. Choosing Variable Properties to Copy. 
Copying Dataset File Properties. Automatically Generating Binned Categories. User-Missing Values in Visual Binning. Time Series Data Transformations. Add Cases: Dictionary Information. Merging More Than Two Data Sources. Select Cases: Random Sample. Restructure Data Wizard: Select Type. Restructure Data Wizard Variables to Cases : Number of Variable Groups. Restructure Data Wizard Variables to Cases : Select Variables. Restructure Data Wizard Variables to Cases : Create Index Variables. Restructure Data Wizard Variables to Cases : Create One Index Variable. Restructure Data Wizard Variables to Cases : Create Multiple Index Variables. Restructure Data Wizard Variables to Cases : Options. Restructure Data Wizard Cases to Variables : Select Variables. Restructure Data Wizard Cases to Variables : Sort Data. Moving, Deleting, and Copying Output. Changing Alignment of Output Items. Adding Items to the Viewer. Finding and Replacing Information in the Viewer. Copying Output into Other Applications. To Copy and Paste Output Items into Another Application. To Print Output and Charts. Page Attributes: Headers and Footers. Changing display order of elements within a dimension. Moving rows and columns within a dimension element. Transposing rows and columns. Ungrouping rows or columns. Rotating row or column labels. Showing hidden rows and columns in a table. Hiding and showing dimension labels. Hiding and showing table titles. Table properties: cell formats. To hide or show a caption. To hide or show a footnote in a table. Commenting or Uncommenting Text. Scoring the active dataset. Merging model and transformation XML files. Combo Box and List Box Controls. Custom Dialogs for Extension Commands. Example: Tables with layers. Data files created from multiple tables. Controlling column elements to control variables in the data file. Variable names in OMS-generated data files. 
Linear models predict a continuous target based on linear relationships between the target and one or more predictors. Linear models are relatively simple and give an easily interpreted mathematical formula for scoring. The properties of these models are well understood and can typically be built very quickly compared to other model types such as neural networks or decision trees on the same dataset. This feature is available in the Statistics Base add-on module. Generalized linear mixed models. Generalized linear mixed models cover a wide variety of models, from simple linear regression to complex multilevel models for non-normal longitudinal data. This feature is available in the Advanced Statistics add-on module. Lightweight tables can be rendered much faster than full-featured pivot tables. Although they lack the editing features of pivot tables, they can easily be converted to pivot tables with all editing features enabled. For more information, see the topic Pivot table options in Chapter 17 on p. The new scoring wizard makes it easy to apply predictive models to score your data, and scoring no longer requires IBM® SPSS® Statistics Server. For more information, see the topic Scoring data with predictive models in Chapter 15 on p. Improved default measurement level. For data read from external sources and new variables created in a session, the method for determining default measurement level has been improved to evaluate more conditions than just the number of unique values. Since measurement level affects the results of many procedures, correct measurement level assignment is often important. For more information, see the topic Data Options in Chapter 17 on p. You can now split the editor pane into two panes arranged with one above the other. You can indent or outdent blocks of syntax or automatically indent selections with a format similar to pasted syntax. 
A new toolbar button allows you to uncomment text that was previously commented out, and a new option setting allows you to paste syntax at the position of the cursor. You can now also navigate to the next or previous syntactical error. For more information, see the topic Using the Syntax Editor in Chapter 13.

Database drivers for salesforce. Analysts can now connect to salesforce.

When you use compiled transformations, transformation commands such as COMPUTE and RECODE are compiled to machine code at run time to improve the performance of these transformations for datasets with a large number of cases. This feature requires SPSS Statistics Server.

Statistics portal is a Web-based interface for IBM® SPSS® Collaboration and Deployment Services users that allows them to analyze their data with the power of the SPSS Statistics engine. They run analyses from custom user interfaces authored in SPSS Statistics with the Custom Dialog Builder and stored in their IBM SPSS Collaboration and Deployment Services Repository.

Windows

There are a number of different types of windows in IBM® SPSS® Statistics: Data Editor. All statistical results, tables, and charts are displayed in the Viewer. You can edit the output and save it for later use. You can edit text, swap data in rows and columns, add color, create multidimensional tables, and selectively hide and show results. You can modify high-resolution charts and plots in chart windows. You can change the colors, select different type fonts or sizes, switch the horizontal and vertical axes, rotate 3-D scatterplots, and even change the chart type. You can edit the output and change font characteristics (type, style, color, size). You can paste your dialog box choices into a syntax window, where your selections appear in the form of command syntax. You can then edit the command syntax to use special features that are not available through dialog boxes.
If you have more than one open Syntax Editor window, command syntax is pasted into the designated Syntax Editor window. The designated windows are indicated by a plus sign in the icon in the title bar. You can change the designated windows at any time. The designated window should not be confused with the active window, which is the currently selected window. If you have overlapping windows, the active window appears in the foreground. If you open a window, that window automatically becomes the active window and the designated window.

Changing the designated window

Make the window that you want to designate the active window (click anywhere in the window), then click the Designate Window button on the toolbar (the plus sign icon). For more information, see the topic Basic Handling of Multiple Data Sources in Chapter 6.

For each procedure or command that you run, a case counter indicates the number of cases processed so far. For statistical procedures that require iterative processing, the number of iterations is displayed. The message Weight on indicates that a weight variable is being used to weight cases for analysis.

Dialog boxes

Most menu selections open dialog boxes. You use dialog boxes to select variables and options for analysis. Dialog boxes for statistical procedures and charts typically have two basic components:

Source variable list. A list of variables in the active dataset. Only variable types that are allowed by the selected procedure are displayed in the source list. Use of short string and long string variables is restricted in many procedures.

Target variable list(s). One or more lists indicating the variables that you have chosen for the analysis, such as dependent and independent variable lists.

Variable names and variable labels in dialog box lists

You can display either variable names or variable labels in dialog box lists, and you can control the sort order of variables in source variable lists.
To control the default display attributes of variables in source lists, choose Options on the Edit menu. For more information, see the topic General options in Chapter 17 on p. You can also change the variable list display attributes within dialogs. The method for changing the display attributes depends on the dialog: If the dialog provides sorting and display controls above the source variable list, use those controls to change the display attributes. If the dialog does not contain sorting controls above the source variable list, right-click on any variable in the source list and select the display attributes from the context menu. Resizing dialog boxes You can resize dialog boxes just like windows, by clicking and dragging the outside borders or corners. For example, if you make the dialog box wider, the variable lists will also be wider. Some dialogs have a Run button instead of the OK button. Generates command syntax from the dialog box selections and pastes the syntax into a syntax window. You can then customize the commands with additional features that are not available from dialog boxes. Cancels any changes that were made in the dialog box settings since the last time it was opened and closes the dialog box. Within a session, dialog box settings are persistent. This control takes you to a Help window that contains information about the current dialog box. You can also use arrow button to move variables from the source list to the target lists. If there is only one target variable list, you can double-click individual variables to move them from the source list to the target list. Data type, measurement level, and variable list icons The icons that are displayed next to variables in dialog box lists provide information about the variable type and measurement level. For more information on numeric, string, date, and time data types, see Variable type on p. E Right-click a variable in the source or target variable list. E Choose Variable Information. 
All you have to do is: Get your data into SPSS Statistics. Select a procedure from the menus to calculate statistics or to create a chart. Select the variables for the analysis. Run the procedure and look at the results. Results are displayed in the Viewer. Statistics Coach If you are unfamiliar with IBM® SPSS® Statistics or with the available statistical procedures, the Statistics Coach can help you get started by prompting you with simple questions, nontechnical language, and visual examples that help you select the basic statistical and charting features that are best suited for your data. It is designed to provide general assistance for many of the basic, commonly used statistical techniques. The Help menu in most windows provides access to the main Help system, plus tutorials and technical reference material. Illustrated, step-by-step instructions on how to use many of the basic features. Hands-on examples of how to create various types of statistical analyses and how to interpret the results. After you make a series of selections, the Statistics Coach opens the dialog box for the statistical, reporting, or charting procedure that meets your selected criteria. Detailed command syntax reference information is available in two forms: integrated into the overall Help system and as a separate document in PDF form in the Command Syntax Reference, available from the Help menu. The algorithms used for most statistical procedures are available in two forms: integrated into the overall Help system and as a separate document in PDF form available on the manuals CD. In many places in the user interface, you can get context-sensitive Help. Dialog box Help buttons. Most dialog boxes have a Help button that takes you directly to a Help topic for that dialog box. The Help topic provides general information and links to related topics. © Copyright SPSS Inc. 
In a command syntax window, position the cursor anywhere within a syntax block for a command and press F1 on the keyboard. A complete command syntax chart for that command will be displayed. Complete command syntax documentation is available from the links in the list of related topics and from the Help Contents tab. Other Resources Technical Support Web site. The Technical Support Web site requires a login ID and password. Information on how to obtain an ID and password is provided at the URL listed above. Developer Central has resources for all levels of users and application developers. Download utilities, graphics examples, new statistical modules, and articles. E Right-click on the term that you want explained. For more information, see the topic Working with Multiple Data Sources in Chapter 6 on p. The current server name is indicated at the top of the dialog box. For more information, see the topic Distributed Analysis Mode in Chapter 4 on p. © Copyright SPSS Inc. For more information, see the topic General options in Chapter 17 on p. For information on reading data from databases, see Reading Database Files on p. Data File Types SPSS Statistics. This is available only on Windows operating systems. Each case is a record. Opening File Options Read variable names. The values are converted as necessary to create valid variable names, including converting spaces to underscores. To read a different worksheet, select the worksheet from the drop-down list. Use the same method for specifying cell ranges as you would with the spreadsheet application. Each column is a variable. If the column contains more than one data type for example, date and numeric , the data type is set to string, and all values are read as valid string values. For numeric variables, blank cells are converted to the system-missing value, indicated by a period. For string variables, a blank is a valid string value, and blank cells are treated as valid string values. 
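The spreadsheet-reading rules described above (spaces in column headers become underscores, a column with more than one data type is read as string, blank numeric cells become system-missing) can be illustrated with a small Python sketch. This is not SPSS code and not how SPSS Statistics implements the import; the function names `to_variable_name` and `read_column` are hypothetical, `None` stands in for the system-missing value, and only the one name-conversion rule the text states is implemented.

```python
from datetime import date


def to_variable_name(header: str) -> str:
    # Only the rule stated in the text: spaces become underscores.
    # Other conversions needed for valid names are not modeled here.
    return header.strip().replace(" ", "_")


def read_column(cells):
    """Return (data_type, values) for one spreadsheet column.

    None represents a blank cell on input and, for numeric columns,
    the system-missing value on output (an assumption of this sketch).
    """
    kinds = set()
    for cell in cells:
        if cell is None:
            continue  # blanks do not influence the detected type
        if isinstance(cell, (int, float)):
            kinds.add("numeric")
        elif isinstance(cell, date):
            kinds.add("date")
        else:
            kinds.add("string")

    if kinds == {"numeric"} or kinds == {"date"}:
        # Single non-string type: blank cells stay missing.
        return kinds.pop(), list(cells)

    # Mixed types (or strings): everything is read as a valid string,
    # and a blank cell is a valid (empty) string value.
    return "string", ["" if cell is None else str(cell) for cell in cells]
```

For example, `read_column([1, None, 2.5])` keeps the blank as missing in a numeric column, while a column mixing a number and a date comes back as strings.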
Values of other types are converted to the system-missing value. For numeric variables, blank cells are converted to the system-missing value, indicated by a period. For string variables, a blank is a valid string value, and blank cells are treated as valid string values. If you do not read variable names from the spreadsheet, the column letters A, B, C,... Records marked for deletion but not actually purged are included. Stata variable names are converted to IBM® SPSS® Statistics variable names in case-sensitive form. Stata variable labels are converted to SPSS Statistics variable labels. Stata date format values are converted to SPSS Statistics DATE format d-m-y values. Reading Database Files You can read data from any database format for which you have a database driver. In local analysis mode, the necessary drivers must be installed on your local computer. In distributed analysis mode available with IBM® SPSS® Statistics Server , the drivers must be installed on the remote server. For more information, see the topic Distributed Analysis Mode in Chapter 4 on p. Note: If you are running the Windows 64-bit version of SPSS Statistics, you cannot read Excel, Access, or dBASE database sources, even though they may appear on the list of available database sources. The 32-bit ODBC drivers for these products are not compatible. E Select the data source. For OLE DB data sources available only on Windows operating systems , you can only select one table. E Specify any relationships between your tables. E Optionally: Specify any selection criteria for your data. Add a prompt for user input to create a parameter query. Save your constructed query before running it. E Follow the instructions for creating a new query. On Linux operating systems, this button is not available. For more information, see the documentation for your database drivers. In distributed analysis mode available with IBM® SPSS® Statistics Server , this button is not available. 
To add data sources in distributed analysis mode, see your system administrator. An ODBC data source consists of two essential pieces of information: the driver that will be used to access the data and the location of the database you want to access. To specify data sources, you must have the appropriate drivers installed. To obtain the most recent version of the. IBM® SPSS® Data Collection Survey Reporter Developer Kit. For information on obtaining a compatible version of SPSS Survey Reporter Developer Kit, go to support.

The following limitations apply to OLE DB data sources: Table joins are not available for OLE DB data sources. You can read only one table at a time. To add OLE DB data sources in distributed analysis mode on a Windows server, consult your system administrator. In distributed analysis mode (available with SPSS Statistics Server), OLE DB data sources are available only on Windows servers, and both .NET and SPSS Survey Reporter Developer Kit must be installed on the server.

Figure 3-2: Database Wizard with access to OLE DB data sources

To add an OLE DB data source: Click Add OLE DB Data Source. In Data Link Properties, click the Provider tab and select the OLE DB provider. Click Next or click the Connection tab. A user name and password may also be required. Click OK after entering all necessary information. Enter a name for the database connection information. This name will be displayed in the list of available OLE DB data sources.

Figure 3-3: Save OLE DB Connection Information As dialog box

Click OK. To add a field. To remove a field. By default, the list of available tables displays only standard database tables. You can control the type of items that are displayed in the list: Tables. Access to real system tables is often restricted to database administrators. Multiple table joins are not supported for OLE DB data sources.

Figure 3-5: Database Wizard, specifying relationships

Establishing relationships.
If outer joins are supported by your driver, you can specify inner joins, left outer joins, or right outer joins. In this example, all rows with matching ID values in the two tables will be included. In addition to one-to-one matching with inner joins, you can also use outer joins to merge tables with a one-to-many matching scheme. For example, you could match a table in which there are only a few records representing data values and associated descriptive labels with values in a table containing hundreds or thousands of records representing survey respondents.

Limiting Retrieved Cases

The Limit Retrieved Cases step allows you to specify the criteria to select subsets of cases (rows). Criteria consist of two expressions and some relation between them. The expressions return a value of true, false, or missing for each case. If the result is true, the case is selected. If the result is false or missing, the case is not selected. Years must be expressed in four-digit form, and dates and times must contain two digits for each portion of the value. For example, January 1, 2005, 1:05 AM would be expressed as: {ts '2005-01-01 01:05:00'}

Functions. A selection of built-in arithmetic, logical, string, date, and time SQL functions is provided. You can drag a function from the list into the expression, or you can enter any valid SQL function. See your database documentation for valid SQL functions.

This option selects a random sample of cases from the data source. Native random sampling, if available for the data source, is faster than IBM® SPSS® Statistics random sampling, because SPSS Statistics random sampling must still read the entire data source to extract a random sample. Note: If you use random sampling, aggregation (available in distributed mode with SPSS Statistics Server) is not available.

You can embed a prompt in your query to create a parameter query. You might want to do this if you need to see different views of the same data.
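The two rules given under Limiting Retrieved Cases (the four-digit-year, two-digits-per-part timestamp literal, and the fact that only a true result selects a case) can be sketched in Python. This is an illustration, not wizard output: the helper names `odbc_timestamp` and `select_cases` are hypothetical, the curly braces follow the standard ODBC escape-sequence form for timestamp literals and are assumed here, and `None` stands in for a missing result.

```python
from datetime import datetime


def odbc_timestamp(value: datetime) -> str:
    # Four-digit year and two digits for every other portion,
    # wrapped in the ODBC {ts '...'} escape sequence (assumed form).
    return "{ts '" + value.strftime("%Y-%m-%d %H:%M:%S") + "'}"


def select_cases(rows, criterion):
    # criterion(row) returns True, False, or None (missing).
    # Only a True result selects the case; False and missing do not.
    return [row for row in rows if criterion(row) is True]


def adult(row):
    # Hypothetical criterion: missing age yields a missing result.
    if row["age"] is None:
        return None
    return row["age"] >= 18


literal = odbc_timestamp(datetime(2005, 1, 1, 1, 5))
selected = select_cases([{"age": 20}, {"age": None}, {"age": 10}], adult)
```

Treating a missing result the same as false is the design point: a case is never selected because a comparison could not be evaluated.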
E Place your cursor in any Expression cell, and click Prompt For Value to create a prompt. Creating a Parameter Query Use the Prompt for Value step to create a dialog box that solicits information from users each time someone runs your query. This feature is useful if you want to query the same data source by using different criteria. The prompt string is displayed each time a user runs your query. The string should specify the kind of information to enter. If the user is not selecting from a list, the string should give hints about how the input should be formatted. An example is as follows: Enter a Quarter Q1, Q2, Q3,... Allow user to select value from list. If this check box is selected, you can limit the user to the values that you place here. Ensure that your values are separated by returns. Choose the data type here Number, String, or Date. E Select one or more aggregated variables. E Select an aggregate function for each aggregate variable. E Optionally, create a variable that contains the number of cases in each break group. Note: If you use SPSS Statistics random sampling, aggregation is not available. Defining Variables Variable names and labels. Click any cell to edit the variable name. Select the Recode to Numeric box for a string variable if you want to automatically convert it to a numeric variable. String values are converted to consecutive integer values based on alphabetical order of the original values. The original values are retained as value labels for the new variables. Width for variable-width string fields. This option controls the width of variable-width string values. The width can be up to 32,767 bytes. Minimize string widths based on observed values. Automatically set the width of each string variable to the longest observed value. 
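The Recode to Numeric behavior described above (string values mapped to consecutive integers in alphabetical order, with the original strings kept as value labels) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SPSS implementation: the function name `recode_to_numeric` is hypothetical, codes are assumed to start at 1, and Python's default string ordering is used as a stand-in for "alphabetical order", which may treat case and non-ASCII characters differently than SPSS does.

```python
def recode_to_numeric(values):
    """Map each distinct string to a consecutive integer code.

    Returns (codes, value_labels), where value_labels maps each
    integer code back to the original string value.
    """
    categories = sorted(set(values))  # assumed ordering, see lead-in
    code_for = {text: code for code, text in enumerate(categories, start=1)}
    value_labels = {code: text for text, code in code_for.items()}
    return [code_for[text] for text in values], value_labels


codes, labels = recode_to_numeric(["west", "east", "north", "east"])
```

Here `codes` holds the new numeric variable and `labels` plays the role of the value labels that preserve the original strings.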
Figure 3-10 Database Wizard, defining variables 27 Data Files Sorting Cases If you are in distributed mode, connected to a remote server available with IBM® SPSS® Statistics Server , you can sort the data before reading it into IBM® SPSS® Statistics. Figure 3-11 Database Wizard, sorting cases You can also sort data after reading it into SPSS Statistics, but presorting may save time for large data sources. Results The Results step displays the SQL Select statement for your query. You can edit the SQL Select statement before you run the query, but if you click the Back button to make changes in previous steps, the changes to the Select statement will be lost. To save the query for future use, use the Save query to file section. To paste complete GET DATA syntax into a syntax window, select Paste it into the syntax editor for further modification. Copying and pasting the Select statement from the Results window will not paste the necessary command syntax. Note: The pasted syntax contains a blank space before the closing quote on each line of SQL that is generated by the wizard. When the command is processed, all lines of the SQL statement are merged together in a very literal fashion. For example, each item in a questionnaire is a variable. How are your variables arranged? To read your data properly, the Text Wizard needs to know how to determine where the data value for one variable ends and the data value for the next variable begins. Spaces, commas, tabs, or other characters are used to separate variables. The variables are recorded in the same order for each case but not necessarily in the same column locations. No delimiter is required between variables. The column location determines which variable is being read. Are variable names included at the top of your file? A case is similar to a record in a database. For example, each respondent to a questionnaire is a case. The first case of data begins on which line number? How are your cases represented? 
Controls how the Text Wizard determines where each case ends and the next one begins. Each line represents a case. Each line contains only one case. If not all lines contain the same number of data values, the number of variables for each case is determined by the line with the greatest number of data values. Cases with fewer data values are assigned missing values for the additional variables. A specific number of variables represents a case. Multiple cases can be contained on the same line, and cases can start in the middle of one line and be continued on the next line. The Text Wizard determines the end of each case based on the number of values read, regardless of the number of lines. How many cases do you want to import?

Text Wizard: Step 3 (Fixed-Width Files)

Figure 3-16: Text Wizard: Step 3 for fixed-width files

This step provides information about cases. A case is similar to a record in a database. For example, each respondent to a questionnaire is a case. The first case of data begins on which line number? How many lines represent a case? Controls how the Text Wizard determines where each case ends and the next one begins. You need to specify the number of lines for each case to read the data correctly.

Which delimiters appear between variables? Indicates the characters or symbols that separate data values. You can select any combination of spaces, commas, semicolons, tabs, or other characters. Multiple, consecutive delimiters without intervening data values are treated as missing values. What is the text qualifier? Characters used to enclose values that contain delimiter characters.

Insert, move, and delete variable break lines as necessary to separate variables. If multiple lines are used for each case, the data will be displayed as one line for each case, with subsequent lines appended to the end of the line. You can overwrite the default variable names with your own variable names. Select a variable in the preview window and then enter a variable name.
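The case-boundary and delimiter rules described above can be made concrete with a short Python sketch. It is an illustration only, not the Text Wizard's algorithm: the function names `parse_line_per_case` and `parse_fixed_count` are hypothetical, `None` stands in for a missing value, and text qualifiers are deliberately not handled.

```python
def parse_line_per_case(lines, delimiter=","):
    """Each line is one case.

    Consecutive delimiters with nothing between them yield a missing
    value, and short lines are padded with missing values up to the
    width of the longest line.
    """
    rows = [
        [cell if cell != "" else None for cell in line.split(delimiter)]
        for line in lines
    ]
    width = max((len(row) for row in rows), default=0)
    return [row + [None] * (width - len(row)) for row in rows]


def parse_fixed_count(lines, n_vars, delimiter=","):
    """A specific number of values is one case, regardless of lines.

    Values are read as one stream, so a case may start in the middle
    of a line and continue on the next one. A trailing incomplete
    case is returned as a shorter list in this sketch.
    """
    values = [value for line in lines for value in line.split(delimiter)]
    return [values[i:i + n_vars] for i in range(0, len(values), n_vars)]


by_line = parse_line_per_case(["1,2,3", "4,,6", "7,8"])
by_count = parse_fixed_count(["1,2,3,4", "5,6"], 3)
```

The first function shows why the longest line fixes the number of variables; the second shows why line breaks carry no meaning once a case is defined by a value count.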
Select a variable in the preview window and then select a format from the drop-down list. Shift-click to select multiple contiguous variables or Ctrl-click to select multiple noncontiguous variables. If more than one format e. Text Wizard Formatting Options Formatting options for reading variables with the Text Wizard include: Do not import. Valid values include numbers, a leading plus or minus sign, and a decimal indicator. Valid values include virtually any keyboard characters and embedded blanks. Months can be represented in digits, Roman numerals, or three-letter abbreviations, or they can be fully spelled out. Select a date format from the list. Valid values are numbers with an optional leading dollar sign and optional commas as thousands separators. Valid values include numbers that use a period as a decimal indicator and commas as thousands separators. Valid values include numbers that use a comma as a decimal indicator and periods as thousands separators. Note: Values that contain invalid characters for the selected format will be treated as missing. You can also paste the syntax generated by the Text Wizard into a syntax window. Reading IBM SPSS Data Collection Data On Microsoft Windows operating systems, you can read data from IBM® SPSS® Data Collection products. Note: This feature is only available with IBM® SPSS® Statistics installed on Microsoft Windows operating systems. To read Data Collection data sources, you must have the following items installed:. To obtain the most recent version of the. IBM® SPSS® Data Collection Survey Reporter Developer Kit. For information on obtaining a compatible version of SPSS Survey Reporter Developer Kit, go to support. You can read Data Collection data sources only in local analysis mode. This feature is not available in distributed analysis mode using SPSS Statistics Server. E In the Data Collection Data Import dialog box, select the variables that you want to include and select any case selection criteria. 
Click OK to read the data. Data Link Properties, Connection tab: To read an IBM® SPSS® Data Collection data source, you need to specify: Metadata Location. Available formats include: Quancept Data File (DRS). Case data in a Quancept. Case data in a Quanvert database. Case data in a relational database in SQL Server. Data Collection XML Data File. Note: The extent to which other settings on the Connection tab or any settings on the other Data Link Properties tabs may or may not affect reading Data Collection data into IBM® SPSS® Statistics is not known, so we recommend that you do not change any of them. Select Variables tab: You can select a subset of variables to read. By default, all standard variables in the data source are displayed and selected. You can then select any system variables that you want to include. By default, all system variables are excluded. You can then select any Codes variables that you want to include. By default, all Codes variables are excluded. You can then select any SourceFile variables that you want to include. By default, all SourceFile variables are excluded. Case Selection tab: For IBM® SPSS® Data Collection data sources that contain system variables, you can select cases based on a number of system variable criteria. You do not need to include the corresponding system variables in the list of variables to read, but the necessary system variables must exist in the source data to apply the selection criteria. If the necessary system variables do not exist in the source data, the corresponding selection criteria are ignored. You can select respondent data, test data, or both. For information on switching between Unicode mode and code page mode, see General Options. Saving Data Files in External Formats: Make the Data Editor the active window (click anywhere in the window to make it active). 
To write variable names to the first row of a spreadsheet or tab-delimited data file: Click Write variable names to spreadsheet in the Save Data As dialog box. To save value labels instead of data values in Excel files: Click Save value labels where defined instead of data values in the Save Data As dialog box. To save value labels to a SAS syntax file (active only when a SAS file type is selected): Click Save value labels into a. For information on exporting data to database tables, see Exporting to a Database. For information on exporting data for use in IBM® SPSS® Data Collection applications, see Exporting to IBM SPSS Data Collection. IBM® SPSS® Statistics format. In releases prior to 10. This format is available only on Windows operating systems. Portable format that can be read by other versions of SPSS Statistics and versions on other operating systems. Variable names are limited to eight bytes and are automatically converted to unique eight-byte names if necessary. For more information, see the topic General options in Chapter 17. No distinction is made between tab characters embedded in values and tab characters that separate values. If the current SPSS Statistics decimal indicator is a period, values are separated by commas. If the current decimal indicator is a comma, values are separated by semicolons. Microsoft Excel 2007 XLSX-format workbook. If the dataset contains more than one million cases, multiple sheets are created in the workbook. Microsoft Excel 97 workbook. If the dataset contains more than 65,536 cases, multiple sheets are created in the workbook. The maximum number of variables is 256, and the maximum number of rows is 16,384. The maximum number of variables that you can save is 256. The maximum number of variables that you can save is 256. The maximum number of variables that you can save is 256. The maximum number of variables that you can save is 256. SAS versions 9 for Windows. SAS versions 9 for UNIX. 
SAS v8 for UNIX. Excel 97 and Excel 2007 also have limits on the number of rows per sheet, but workbooks can have multiple sheets, and multiple sheets are created if the single-sheet maximum is exceeded. SPSS Statistics Variable Type Numeric Comma Dollar Date Time String Excel Data Format 0. These illegal characters are replaced with an underscore when the data are exported. SPSS Statistics variable names that contain multibyte characters for example, Japanese or Chinese characters are converted to variables names of the general form Vnnn, where nnn is an integer value. Where they exist, SPSS Statistics variable labels are mapped to the SAS variable labels. If no variable label exists in the SPSS Statistics data, the variable name is mapped to the SAS variable label. SAS allows only one value for system-missing, whereas SPSS Statistics allows numerous user-missing values in addition to system-missing. A maximum of 32,767 variables can be saved to SAS 6-8. Value labels are dropped for string variables, non-integer numeric values, and numeric values greater than an absolute value of 2,147,483,647. Any characters other than letters, numbers, and underscores are converted to underscores. IBM® SPSS® Statistics variable names that contain multibyte characters for example, Japanese or Chinese characters are converted to variable names of the general form Vnnn, where nnn is an integer value. By default, all variables will be saved. Selects only variables in variable sets currently in use. For more information, see the topic Using variable sets to show and hide variables in Chapter 16 on p. To Save a Subset of Variables E Make the Data Editor the active window click anywhere in the window to make it active. E Select the variables that you want to save. Append new records rows to a database table. Completely replace a database table or create a new table. E Follow the instructions in the export wizard to export the data. 
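The name-conversion and value-label rules stated earlier in this section can be sketched in Python. This is only an illustration of the rules as worded here, not the export code itself. The function names are hypothetical, the way the integer in `Vnnn` is chosen is an assumption (a simple running counter), and collisions between converted names are not handled.

```python
import re

INT32_MAX = 2_147_483_647  # largest absolute value that keeps its label

def export_names(names):
    """Convert variable names following the export rules quoted in the text."""
    converted = []
    counter = 0  # assumption: Vnnn numbers are assigned sequentially
    for name in names:
        if not name.isascii():
            # Names with multibyte characters become Vnnn.
            counter += 1
            converted.append(f"V{counter}")
        else:
            # Anything other than letters, numbers, and underscores
            # is replaced with an underscore.
            converted.append(re.sub(r"[^A-Za-z0-9_]", "_", name))
    return converted

def keeps_value_label(value):
    """Return True if a value label for this value would survive export."""
    if isinstance(value, str):
        return False              # labels are dropped for string variables
    if float(value) != int(value):
        return False              # ... and for non-integer numeric values
    return abs(int(value)) <= INT32_MAX
```

For example, `export_names(["age", "income.2010", "q1$a"])` returns `["age", "income_2010", "q1_a"]`, and `keeps_value_label(1.5)` is `False`.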
You can change the data type to any type available in the drop-down list. If there is a data type mismatch that cannot be resolved by the database, an error results and no data are exported to the database. User-missing values are treated as regular, valid, nonmissing values. Export numeric user-missing as nulls and export string user-missing values as blank spaces. Numeric user-missing values are treated the same as system-missing values. String user-missing values are converted to blank spaces (strings cannot be system-missing). Note: Exporting data to OLE DB data sources is not supported. On Linux operating systems, this button is not available. For more information, see the documentation for your database drivers. In distributed analysis mode (available with IBM® SPSS® Statistics Server), this button is not available. To add data sources in distributed analysis mode, see your system administrator. An ODBC data source consists of two essential pieces of information: the driver that will be used to access the data and the location of the database you want to access. To specify data sources, you must have the appropriate drivers installed. Some data sources may require a login ID and password before you can proceed to the next step. Choosing How to Export the Data: After you select the data source, you indicate the manner in which you want to export the data. For more information, see the topic Replacing Values in Existing Fields. Add new fields to an existing table. For more information, see the topic Adding New Fields. Append new records to an existing table. Adds new records (rows) to an existing table containing the values from cases in the active dataset. For more information, see the topic Appending New Records (Cases). Drop an existing table and create a new table of the same name. For more information, see the topic Creating a New Table or Replacing a Table. Create a new table. 
Creates a new table in the database containing data from selected variables in the active dataset. The name can be any value that is allowed as a table name by the data source. The name cannot duplicate the name of an existing table or view in the database. For more information, see the topic Creating a New Table or Replacing a Table on p. This panel in the Export to Database Wizard displays a list of tables and views in the selected database. Figure 3-24 Export to Database Wizard, selecting a table or view By default, the list displays only standard database tables. You can control the type of items that are displayed in the list: Tables. Access to real system tables is often restricted to database administrators. To delete a connection line: E Select the connection line and press the Delete key. E In the Select a table or view panel, select the database table. If there is a data type mismatch that cannot be resolved by the database, an error results and no data is exported to the database. The letter a in the icon next to a variable denotes a string variable. E In the Select a table or view panel, select the database table. Appending New Records Cases To append new records cases to a database table: E In the Choose how to export the data panel of the Export to Database Wizard, select Append new records to an existing table. E In the Select a table or view panel, select the database table. If any of these cases duplicate existing records in the database, an error may result if a duplicate key value is encountered. For information on exporting only selected cases, see Selecting Cases to Export on p. If the table name contains any characters other than letters, numbers, or an underscore, the name must be enclosed in double quotes. E If you are replacing an existing table, in the Select a table or view panel, select the database table. E Drag and drop variables into the Variables to save column. 
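The table-name quoting rule mentioned above (names containing anything other than letters, numbers, or an underscore must be enclosed in double quotes) can be sketched as a small Python helper. The function name is hypothetical, and the text does not say how a double quote inside the name itself should be handled, so this sketch leaves such names untreated beyond wrapping them.

```python
import re

def quote_table_name(name):
    """Wrap a table name in double quotes when the stated rule requires it."""
    if re.fullmatch(r"[A-Za-z0-9_]+", name):
        return name                 # only letters, numbers, underscores
    return '"' + name + '"'         # anything else must be quoted
```

For example, `survey_2010` is returned unchanged, while `survey results` becomes `"survey results"`.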
Figure 3-30 Export to Database Wizard, selecting variables for a new table Primary key. All values of the primary key must be unique or an error will result. If you select a single variable as the primary key, every record case must have a unique value for that variable. It also gives you the option of either exporting the data or pasting the underlying command syntax to a syntax window. Figure 3-31 Export to Database Wizard, finish panel Summary Information Dataset. The IBM® SPSS® Statistics session name for the dataset that will be used to export data. This information is primarily useful if you have multiple open data sources. Data sources opened using the graphical user interface for example, the Database Wizard are automatically assigned names such as DataSet1, DataSet2, etc. A data source opened using command syntax will have a dataset name only if one is explicitly assigned. For more information, see the topic Selecting Cases to Export on p. User-missing values can be exported as valid values or treated the same as system-missing for numeric variables and converted to blank spaces for string variables. This setting is controlled in the panel in which you select the variables to export. For original variables read from the Data Collection data source, any metadata attributes not recognized by SPSS Statistics are preserved in their original state. The presence or absence of value labels can affect the metadata attributes of variables and consequently the way those variables are read by Data Collection applications. This feature is only available with SPSS Statistics installed on Microsoft Windows operating systems, and is only available in local analysis mode. This feature is not available in distributed analysis mode using SPSS Statistics Server. To obtain the most recent version of the. IBM® SPSS® Data Collection Survey Reporter Developer Kit. For information on obtaining a compatible version of SPSS Survey Reporter Developer Kit, go to support. 
For most analysis and charting procedures, the original data source is reread each time you run a different procedure. Command syntax is not available with the Student Version. By default, this option is selected. You can turn it off by deselecting Cache data locally. For the Database Wizard, you can paste the generated command syntax and delete the CACHE command. For example, for data tables read from a database source, the SQL query that reads the information from the database must be reexecuted for any command or procedure that needs to read the data. The data cache is a temporary copy of the complete data. Note: By default, the Database Wizard automatically creates a data cache, but if you use the GET DATA command in command syntax to read a database, a data cache is not automatically created. Command syntax is not available with the Student Version. Click OK or Cache Now. For large data sources, scrolling through the contents of the Data View tab in the Data Editor will be much faster if you cache the data. Each time you start a new session, the value is reset to the default of 20. Chapter 4: Distributed Analysis Mode. Distributed analysis mode allows you to use a computer other than your local or desktop computer for memory-intensive work. Any task that takes a long time in local analysis mode may be a good candidate for distributed analysis. Distributed analysis affects only data-related tasks, such as reading data, transforming data, computing new variables, and calculating statistics. Distributed analysis has no effect on tasks related to editing output, such as manipulating pivot tables or modifying charts. Note: Distributed analysis is available only if you have both a local version and access to a licensed server version of the software that is installed on a remote server. Server Login: The Server Login dialog box allows you to select the computer that processes commands and runs procedures. 
You can select your local computer or a remote server. Figure 4-1 Server Login dialog box. Remote servers usually require a user ID and password, and a domain name may also be necessary. Contact your system administrator for information about available servers, a user ID and password, domain names, and other connection information. You can select a default server and save the user ID, domain name, and password that are associated with any server. You are automatically connected to the default server when you start a new session. If you are licensed to use the Statistics Adapter and your site is running IBM® SPSS® Collaboration and Deployment Services 3. If you are not logged on to an IBM® SPSS® Collaboration and Deployment Services Repository, you will be prompted to enter connection information before you can view the list of servers. Adding and Editing Server Login Settings: Use the Server Login Settings dialog box to add or edit connection information for remote servers for use in distributed analysis mode. Figure 4-2 Server Login Settings dialog box. Contact your system administrator for a list of available servers, port numbers for the servers, and additional connection information. Do not use the Secure Socket Layer unless instructed to do so by your administrator. The port number is the port that the server software uses for communications. You can enter an optional description to display in the servers list. Connect with Secure Socket Layer. Secure Socket Layer (SSL) encrypts requests for distributed analysis when they are sent to the remote server. Before you use SSL, check with your administrator. To select a default server: In the server list, select the box next to the server that you want to use. Enter the user ID, domain name, and password that were provided by your administrator. Note: You are automatically connected to the default server when you start a new session. To switch to another server: Select the server from the list. 
Enter your user ID, domain name, and password if necessary. Note: When you switch servers during a session, all open windows are closed. You will be prompted to save changes before the windows are closed. To add a server: Get the server connection information from your administrator. Click Add to open the Server Login Settings dialog box. Enter the connection information and optional settings, and then click OK. To edit a server: Get the revised connection information from your administrator. Click Edit to open the Server Login Settings dialog box. Enter the changes and click OK. To search for available servers: Note: The ability to search for available servers is available only if you are licensed to use the Statistics Adapter and your site is running IBM® SPSS® Collaboration and Deployment Services 3. If you are not logged on to an IBM® SPSS® Collaboration and Deployment Services Repository, you will be prompted for connection information. Select one or more available servers and click OK. The servers will now appear in the Server Login dialog box. This dialog box appears when you click Search... Figure 4-3 Search for Servers dialog box. Select one or more servers and click OK to add them to the Server Login dialog box. Although you can manually add servers in the Server Login dialog box, searching for available servers lets you connect to servers without requiring that you know the correct server name and port number. This information is automatically provided. However, you still need the correct logon information, such as user name, domain, and password. Opening Data Files from a Remote Server: In distributed analysis mode, the Open Remote File dialog box replaces the standard Open File dialog box. The current server name is indicated at the top of the dialog box. 
File Access in Local and Distributed Analysis Mode: The view of data folders (directories) and drives for both your local computer and the network is based on the computer that you are currently using to process commands and run procedures, which is not necessarily the computer in front of you. Although you may see familiar folder names such as Program Files and drives such as C, these items are not the folders and drives on your computer; they are the folders and drives on the remote server. In local mode, you access other devices from your local computer. In distributed mode, you access other network devices from the remote server. Availability of Procedures in Distributed Analysis Mode: In distributed analysis mode, procedures are available for use only if they are installed on both your local version and the version on the remote server. Switching back to local mode will restore all affected procedures. Sharename is the folder (directory) on that computer that is designated as a shared folder. Path is any additional folder (subdirectory) path below the shared folder. UNIX Absolute Path Specifications: For UNIX server versions, there is no equivalent to the UNC path, and all directory paths must be absolute paths that start at the root of the server; relative paths are not allowed. The Data Editor window opens automatically when you start a session. The Data Editor provides two views of your data: Data View. Data View: Figure 5-1 Data View. Many of the features of Data View are similar to the features that are found in spreadsheet applications. There are, however, several important distinctions: Rows are cases. Each row represents a case or an observation. For example, each individual respondent to a questionnaire is a case. Each column represents a variable or characteristic that is being measured. For example, each item on a questionnaire is a variable. Each cell contains a single value of a variable for a case. 
The cell is where the case and the variable intersect. Cells contain only data values. Unlike spreadsheet programs, cells in the Data Editor cannot contain formulas. You can enter data in any cell. For numeric variables, blank cells are converted to the system-missing value. For string variables, a blank is considered a valid value. In Variable View: Rows are variables. Columns are variable attributes. You can also use variables in the active dataset as templates for other variables in the active dataset. Copy Data Properties is available on the Data menu in the Data Editor window. To display or define variable attributes E Make the Data Editor the active window. E Double-click a variable name at the top of the column in Data View, or click the Variable View tab. Variable names The following rules apply to variable names: Each variable name must be unique; duplication is not allowed. Subsequent characters can be any combination of letters, numbers, nonpunctuation characters, and a period. In code page mode, sixty-four bytes typically means 64 characters in single-byte languages for example, English, French, German, Spanish, Italian, Hebrew, Russian, Greek, Arabic, and Thai and 32 characters in double-byte languages for example, Japanese, Chinese, and Korean. Many string characters that only take one byte in code page mode take two or more bytes in Unicode mode. Variable names cannot contain spaces. You can only create scratch variables with command syntax. Variable names ending with a period should be avoided, since the period may be interpreted as a command terminator. You can only create variables that end with a period in command syntax. You cannot create variables that end with a period in dialog boxes that create new variables. Reserved keywords cannot be used as variable names. Reserved keywords are ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH. 
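The variable-name rules listed above can be collected into a small checker. The following Python sketch covers only the rules that are stated in this text (uniqueness, no spaces, the 64-byte limit, the trailing-period advice, and the reserved keywords); the function name is hypothetical, and treating keywords and duplicates as case-insensitive is an assumption rather than something the text confirms.

```python
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE",
            "LT", "NE", "NOT", "OR", "TO", "WITH"}

def name_problems(name, existing=(), encoding="utf-8"):
    """Return a list of rule violations for a proposed variable name."""
    problems = []
    if name.upper() in RESERVED:          # assumption: case-insensitive match
        problems.append("reserved keyword")
    if any(ch.isspace() for ch in name):
        problems.append("contains spaces")
    # The limit is in bytes, so multibyte characters use it up faster.
    if len(name.encode(encoding)) > 64:
        problems.append("longer than 64 bytes")
    if name.endswith("."):
        problems.append("ends with a period")
    if name.upper() in {other.upper() for other in existing}:
        problems.append("duplicate name")
    return problems
```

An empty list means that none of these particular rules is violated. The byte-length check illustrates the point about Unicode mode: a name of 22 three-byte UTF-8 characters occupies 66 bytes and is rejected, although it has far fewer than 64 characters.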
When long variable names need to wrap onto multiple lines in output, lines are broken at underscores, periods, and points where content changes from lower case to upper case. Variable measurement level You can specify the level of measurement as scale numeric data on an interval or ratio scale , ordinal, or nominal. Nominal and ordinal data can be either string alphanumeric or numeric. A variable can be treated as nominal when its values represent categories with no intrinsic ranking for example, the department of the company in which an employee works. A variable can be treated as scale continuous when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars. For example, for a string variable with the values of low, medium, high, the order of the categories is interpreted as high, low, medium, which is not the correct order. In general, it is more reliable to use numeric codes to represent ordinal data. Conditions are evaluated in the order listed in the table. The default is 24. You can change the cutoff value in the Options dialog box. For more information, see the topic Data Options in Chapter 17 on p. For more information, see the topic Assigning the Measurement Level in Chapter 7 on p. By default, all new variables are assumed to be numeric. You can use Variable Type to change the data type. The contents of the Variable Type dialog box depend on the selected data type. For some data types, there are text boxes for width and number of decimals; for other data types, you can simply select a format from a scrollable list of examples. A variable whose values are numbers. Values are displayed in standard numeric format. A numeric variable whose values are displayed with commas delimiting every three places and displayed with the period as a decimal delimiter. 
Values cannot contain commas to the right of the decimal indicator. A numeric variable whose values are displayed with periods delimiting every three places and with the comma as a decimal delimiter. Values cannot contain periods to the right of the decimal indicator. A numeric variable whose values are displayed with an embedded E and a signed power-of-10 exponent. The Data Editor accepts numeric values for such variables with or without an exponent. The exponent can be preceded by E or D with an optional sign or by the sign alone—for example, 123, 1. A numeric variable whose values are displayed in one of several calendar-date or clock-time formats. Select a format from the list. You can enter dates with slashes, hyphens, periods, commas, or blank spaces as delimiters. The century range for two-digit year values is determined by your Options settings from the Edit menu, choose Options, and then click the Data tab. You can enter data values with or without the leading dollar sign. A variable whose values are not numeric and therefore are not used in calculations. Uppercase and lowercase letters are considered distinct. This type is also known as an alphanumeric variable. E Select the data type in the Variable Type dialog box. Input versus display formats Depending on the format, the display of values in Data View may differ from the actual value as entered and stored internally. Following are some general guidelines: For numeric, comma, and dot formats, you can enter values with any number of decimal positions up to 16 , and the entire value is stored internally. However, the complete value is used in all computations. For string variables, all values are right-padded to the maximum width. For a string variable with a maximum width of three, a value of No is stored internally as 'No ' and is not equivalent to ' No'. 
For date formats, you can use slashes, dashes, spaces, commas, or periods as delimiters between day, month, and year values, and you can enter numbers, three-letter abbreviations, or complete names for month values. Dates of the general format dd-mmm-yy are displayed with dashes as delimiters and three-letter abbreviations for the month. Internally, dates are stored as the number of seconds from October 14, 1582. The century range for dates with two-digit years is determined by your Options settings from the Edit menu, choose Options, and then click the Data tab. For time formats, you can use colons, periods, or spaces as delimiters between hours, minutes, and seconds. Times are displayed with colons as delimiters. Internally, times are stored as a number of seconds that represents a time interval. For example, 10:00:00 is stored internally as 36000, which is 60 seconds per minute x 60 minutes per hour x 10 hours. Variable labels You can assign descriptive variable labels up to 256 characters 128 characters in double-byte languages. Variable labels can contain spaces and reserved characters that are not allowed in variable names. To specify variable labels E Make the Data Editor the active window. E Double-click a variable name at the top of the column in Data View, or click the Variable View tab. E In the Label cell for the variable, enter the descriptive variable label. Value labels can be up to 120 bytes. E For each value, enter the value and a label. E Click Add to enter the value label. E For variable labels, select the Label cell for the variable in Variable View in the Data Editor. E For value labels, select the Values cell for the variable in Variable View in the Data Editor, click the button in the cell, and select the label that you want to modify in the Value Labels dialog box. Figure 5-5 Missing Values dialog box You can enter up to three discrete individual missing values, a range of missing values, or a range plus one discrete value. 
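The internal storage conventions described above for strings, dates, and times can be sketched in Python. This is an illustration under stated assumptions, not SPSS code: the helper names are hypothetical, and the date epoch is assumed to be midnight at the start of October 14, 1582.

```python
from datetime import datetime

# Assumption: the count of seconds starts at midnight on this date.
EPOCH = datetime(1582, 10, 14)

def date_to_seconds(moment):
    """Seconds elapsed from the epoch to the given date and time."""
    return (moment - EPOCH).total_seconds()

def time_to_seconds(hours, minutes=0, seconds=0):
    """A time interval expressed as a number of seconds."""
    return hours * 60 * 60 + minutes * 60 + seconds

def pad_string(value, width):
    """Right-pad a string value to the variable's maximum width."""
    return value.ljust(width)
```

Here `time_to_seconds(10)` gives 36000, matching the 10:00:00 example in the text, and `pad_string("No", 3)` gives `'No '`, which is not equal to `' No'`.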
Missing values for string variables cannot exceed eight bytes. Enter the values or range of values that represent missing data. When you open one of these dialogs, variables that meet the role requirements will be automatically displayed in the destination list(s). Available roles are: Input. The variable will be used as an input e. The variable will be used as an output or target e. The variable will be used as both input and output. The variable has no role assignment. The variable will be used to partition the data into separate samples for training, testing, and validation. Included for round-trip compatibility with IBM® SPSS® Modeler. By default, all variables are assigned the Input role. Role assignment only affects dialogs that support role assignment. It has no effect on command syntax. To assign roles: Select the role from the list in the Role cell for the variable. Column width: You can specify a number of characters for the column width. Column widths can also be changed in Data View by clicking and dragging the column borders. Column width for proportional fonts is based on average character width. Column widths affect only the display of values in the Data Editor. The default alignment is right for numeric variables and left for string variables. This setting affects only the display in Data View. You can: Copy a single attribute (for example, value labels) and paste it to the same attribute cell(s) for one or more variables. Copy all attributes from one variable and paste them to one or more other variables. Create multiple new variables with all the attributes of a copied variable. Applying variable definition attributes to other variables: To apply individual attributes from a defined variable: In Variable View, select the attribute cell that you want to apply to other variables. You can select multiple target variables. 
To apply all attributes from a defined variable: In Variable View, select the row number for the variable with the attributes that you want to use. The entire row is highlighted. You can select multiple target variables. The entire row is highlighted. In the Paste Variables dialog box, enter the number of variables that you want to create. Drag and drop the variables to which you want to assign the new attribute to the Selected Variables list. Enter a name for the attribute. Attribute names must follow the same rules as variable names. For more information, see the topic Variable names. Enter an optional value for the attribute. If you select multiple variables, the value is assigned to all selected variables. You can leave this blank and then enter values for each variable in Variable View. Figure 5-6 New Custom Attribute dialog box. Display attribute in the Data Editor. Displays the attribute in Variable View of the Data Editor. For information on controlling the display of custom attributes, see Displaying and Editing Custom Variable Attributes below. Display Defined List of Attributes. Displaying and Editing Custom Variable Attributes: Custom variable attributes can be displayed and edited in the Data Editor in Variable View. Figure 5-7 Custom variable attributes displayed in Variable View. Custom variable attribute names are enclosed in square brackets. A blank cell indicates that the attribute does not exist for that variable; the text Empty displayed in a cell indicates that the attribute exists for that variable but no value has been assigned to the attribute for that variable. Once you enter text in the cell, the attribute exists for that variable with the value you enter. Click the button in the cell to display the list of values. Select (check) the custom variable attributes you want to display. The custom variable attributes are the ones enclosed in square brackets. Variable Attribute Arrays: The text Array... 
Click the button in the cell to display and edit the list of values.

Figure 5-9: Custom Attribute Array dialog box

Customizing Variable View

You can use Customize Variable View to control which attributes are displayed in Variable View (for example, name, type, label) and the order in which they are displayed. Any custom variable attributes associated with the dataset are enclosed in square brackets. For more information, see the topic Creating Custom Variable Attributes. You can also control the default display and order of attributes in Variable View. For more information, see the topic Changing the default variable view in Chapter 17.

Select (check) the variable attributes you want to display. Use the up and down arrow buttons to change the display order of the attributes.

Figure 5-10: Customize Variable View dialog box

Restore Defaults. Apply the default display and order settings.

Spell checking variable and value labels

To check the spelling of variable labels and value labels, select the Variable View tab in the Data Editor window. This limits the spell checking to the value labels for a particular variable. Spell checking is limited to variable labels and value labels in Variable View of the Data Editor.

String data values

To check the spelling of string data values, select the Data View tab of the Data Editor. Optionally, select one or more variables (columns) to check. To select a variable, click the variable name at the top of the column. If there are no string variables in the dataset, or if none of the selected variables is a string variable, the Spelling option on the Utilities menu is disabled.

Entering data

In Data View, you can enter data directly in the Data Editor. You can enter data in any order. You can enter data by case or by variable, for selected areas or for individual cells. The active cell is highlighted. The variable name and row number of the active cell are displayed in the top left corner of the Data Editor.
When you select a cell and enter a data value, the value is displayed in the cell editor at the top of the Data Editor. Data values are not recorded until you press Enter or select another cell. If you enter a value in an empty column, the Data Editor automatically creates a new variable and assigns a variable name.

To enter numeric data

Select a cell in Data View. The value is displayed in the cell editor at the top of the Data Editor. To record the value, press Enter or select another cell.

To enter non-numeric data

Double-click a variable name at the top of the column in Data View or click the Variable View tab. Click the button in the Type cell for the variable. Select the data type in the Variable Type dialog box. Double-click the row number or click the Data View tab. Choose a value label from the drop-down list. The value is entered, and the value label is displayed in the cell.

Note: Changing the column width does not affect the variable width.

Editing data

With the Data Editor, you can modify data values in Data View in many ways. You can:

Change data values
Cut, copy, and paste data values
Add and delete cases
Add and delete variables
Change the order of variables

Replacing or modifying data values

To delete the old value and enter a new value

In Data View, double-click the cell. The cell value is displayed in the cell editor. Edit the value directly in the cell or in the cell editor. Press Enter or select another cell to record the new value.

Cutting, copying, and pasting data values

You can cut, copy, and paste individual cell values or groups of values in the Data Editor. If no conversion is possible, the system-missing value is inserted in the target cell.

Converting numeric or date into string. Numeric (for example, numeric, dollar, dot, or comma) and date formats are converted to strings if they are pasted into a string variable cell. The string value is the numeric value as displayed in the cell.
For example, for a dollar format variable, the displayed dollar sign becomes part of the string value.

Converting string into numeric or date. String values that contain acceptable characters for the numeric or date format of the target cell are converted to the equivalent numeric or date value.

Converting date into numeric. Date and time values are converted to a number of seconds if the target cell is one of the numeric formats (for example, numeric, dollar, dot, or comma). Because dates are stored internally as the number of seconds since October 14, 1582, converting dates to numeric values can yield some extremely large numbers.

Numeric values are converted to dates or times if the value represents a number of seconds that can produce a valid date or time. For dates, numeric values that are less than 86,400 (the number of seconds in one day) are converted to the system-missing value.

Inserting new cases

Entering data in a cell in a blank row automatically creates a new case. The Data Editor inserts the system-missing value for all other variables for that case. If there are any blank rows between the new case and the existing cases, the blank rows become new cases with the system-missing value for all variables. You can also insert new cases between existing cases.

To insert new cases between existing cases

In Data View, select any cell in the case (row) below the position where you want to insert the new case.

The Data Editor inserts the system-missing value for all cases for the new variable. If there are any empty columns in Data View or empty rows in Variable View between the new variable and the existing variables, these rows or columns also become new variables with the system-missing value for all cases. You can also insert new variables between existing variables.

To insert new variables between existing variables

Select any cell in the variable to the right of (Data View) or below (Variable View) the position where you want to insert the new variable.
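The date-to-numeric rule above can be illustrated with a short sketch. This is not SPSS code; it is a minimal Python model that assumes the internal value is a count of seconds from midnight on October 14, 1582, and it uses `None` as a stand-in for the system-missing value. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the epoch is midnight at the start of October 14, 1582.
EPOCH = datetime(1582, 10, 14)
SECONDS_PER_DAY = 86_400


def date_to_seconds(moment: datetime) -> int:
    """Return the whole number of seconds between the epoch and `moment`."""
    return (moment - EPOCH) // timedelta(seconds=1)


def seconds_to_date(seconds: int):
    """Return the date for a seconds value, or None (standing in for
    system-missing) when the value is below one day's worth of seconds."""
    if seconds < SECONDS_PER_DAY:
        return None
    return EPOCH + timedelta(seconds=seconds)


# A modern date becomes a very large number of seconds.
millennium_seconds = date_to_seconds(datetime(2000, 1, 1))
print(millennium_seconds)
print(seconds_to_date(millennium_seconds))
print(seconds_to_date(86_399))
```

Under these assumptions, January 1, 2000 maps to a value in the tens of billions, which is why pasting a date into a numeric cell produces such large numbers.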
To move variables E To select the variable, click the variable name in Data View or the row number for the variable in Variable View. E Drag and drop the variable to the new location. To change data type You can change the data type for a variable at any time by using the Variable Type dialog box in Variable View. The Data Editor will attempt to convert existing values to the new type. If no conversion is possible, the system-missing value is assigned. The conversion rules are the same as the rules for pasting data values to a variable with a different format type. E Enter an integer value that represents the current row number in Data View. Note: The current row number for a particular case can change due to sorting and other actions. E Enter the variable name or select the variable from the drop-down list. E Select the imputation or Original data from the drop-down list. Figure 5-12 Go To dialog box Alternatively, you can select the imputation from the drop-down list in the edit bar in Data View of the Data Editor. Figure 5-13 Data Editor with imputation markings ON Relative case position is preserved when selecting imputations. If you select imputation 2 in the dropdown, case 2034, the 34th case in imputation 2, would display at the top of the grid. If you select Original data in the dropdown, case 34 would display at the top of the grid. Column position is also preserved when navigating between imputations, so that it is easy to compare values between imputations. Finding and replacing values is restricted to a single column. The search direction is always down. For dates and times, the formatted values as displayed in Data View are searched. For other numeric variables, Contains, Begins with, and Ends with search formatted values. With the Entire cell option, the search value can be formatted or unformatted simple F numeric format , but only exact numeric values to the precision displayed in the Data Editor are matched. 
If value labels are displayed for the selected variable column, the label text is searched, not the underlying data value, and you cannot replace the label text.

Variable View

Find is only available for the Name, Label, Values, Missing, and custom variable attribute columns. Replace is only available for the Label, Values, and custom attribute columns. In the Values (value labels) column, the search string can match either the data value or a value label.

Note: Replacing the data value will delete any previous value label associated with that value.

Case selection status in the Data Editor

If you have selected a subset of cases but have not discarded unselected cases, unselected cases are marked in the Data Editor with a diagonal line (slash) through the row number.

This option controls the font characteristics of the data display. This option toggles the display of grid lines. This option is available only in Data View.

Using Multiple Views

In Data View, you can create multiple views (panes) by using the splitters that are located below the horizontal scroll bar and to the right of the vertical scroll bar. You can also use the Window menu to insert and remove pane splitters. If the top left cell is selected, splitters are inserted to divide the current view approximately in half, both horizontally and vertically.

The information in the currently displayed view is printed. In Data View, the data are printed. Grid lines are printed if they are currently displayed in the selected view. Value labels are printed in Data View if they are currently displayed. Otherwise, the actual data values are printed. Use the View menu in the Data Editor window to display or hide grid lines and toggle between the display of data values and value labels.

To print Data Editor contents

Make the Data Editor the active window. Click the tab for the view that you want to print.

Chapter 6: Working with Multiple Data Sources

Starting with version 14.
Compare the contents of different data sources.
Copy and paste data between data sources.

Basic Handling of Multiple Data Sources

Figure 6-1: Two data sources open at same time

By default, each data source that you open is displayed in a new Data Editor window. See General options for information on changing the default behavior to display only one dataset at a time, in a single Data Editor window. Any previously open data sources remain open and available for further use. Only the variables in the active dataset are available for analysis.

Figure 6-2: Variable list containing variables in the active dataset

You cannot change the active dataset when any dialog box that accesses the data is open (including all dialog boxes that display variable lists). At least one Data Editor window must be open during a session.

Working with Multiple Datasets in Command Syntax

If you use command syntax to open data sources (for example, GET FILE, GET DATA), you need to use the DATASET NAME command to name each dataset explicitly in order to have more than one data source open at the same time. When working with command syntax, the active dataset name is displayed on the toolbar of the syntax window. All of the following actions can change the active dataset:

Use the DATASET ACTIVATE command.
Select a dataset name from the toolbar in the syntax window.

Renaming Datasets

When you open a data source through the menus and dialog boxes, each data source is automatically assigned a dataset name of DataSetn, where n is a sequential integer value. When you open a data source using command syntax, no dataset name is assigned unless you explicitly specify one with DATASET NAME.

Enter a new dataset name that conforms to variable naming rules. For more information, see the topic Variable names in Chapter 5.

Click the General tab. Select (check) Open only one dataset at a time. For more information, see the topic General options in Chapter 17.
Create new variables with a few distinct categories that represent ranges of values from variables with a large number of possible values. Assignment of measurement level nominal, ordinal, or scale. All of these variable properties and others can be assigned in Variable View in the Data Editor. This is particularly useful for categorical data with numeric codes used for category values. This is important for procedures in which measurement level can affect the results or determines which features are available. For more information, see the topic Setting measurement level for variables with unknown measurement level on p. For more information, see the topic Copying Data Properties on p. © Copyright SPSS Inc. E Specify the number of cases to scan to generate the list of unique values. This is primarily useful to prevent listing hundreds, thousands, or even millions of values for scale continuous interval, ratio variables. E Enter the label text for any unlabeled values that are displayed in the Value Label grid. E If there are values for which you want to create value labels but those values are not displayed, you can enter values in the Value column below the last scanned value. E Repeat this process for each listed variable for which you want to create value labels. E Click OK to apply the value labels and other variable properties. For each scanned variable, a check mark in the Unlabeled U. To sort the variable list to display all variables with unlabeled values at the top of the list: E Click the Unlabeled column heading under Scanned Variable List. You can also sort by variable name or measurement level by clicking the corresponding column heading under Scanned Variable List. Value Label Grid Label. You can add or change labels in this column. Unique values for each selected variable. This list of unique values is based on the number of scanned cases. The number of times each value occurs in the scanned cases. 
You can change the missing values designation of the category by clicking the check box. You can use Variable View in the Data Editor to modify the missing values categories for variables with missing values ranges. For more information, see the topic Missing values in Chapter 5 on p. Indicates that you have added or changed a value label. In addition, the Suggest button for the measurement level will be disabled. Value labels are primarily useful for categorical nominal and ordinal variables, and some procedures treat categorical and scale variables differently; so it is sometimes important to assign the correct measurement level. However, by default, all new numeric variables are assigned the scale measurement level. Thus, many variables that are in fact categorical may initially be displayed as scale. If you are unsure of what measurement level to assign to a variable, click Suggest. For more information, see the topic Roles in Chapter 5 on p. You can copy value labels and other variable properties from another variable to the currently selected variable or from the currently selected variable to one or more other variables. To create labels for unlabeled values automatically, click Automatic Labels. Variable Label and Display Format You can change the descriptive variable label and the display format. For string variables, you can change only the variable label, not the display format. For more information, see the topic Currency options in Chapter 17 on p. For example, an internal numeric value of less than 86,400 is invalid for a date format variable. The Explanation area provides a brief description of the criteria used to provide the suggested measurement level. E Click Continue to accept the suggested level of measurement or Cancel to leave the measurement level unchanged. In addition to the standard variable attributes, such as value labels, missing values, and measurement level, you can create your own custom variable attributes. 
Attribute names must follow the same rules as variable names. For more information, see the topic Variable names in Chapter 5 on p. The value assigned to the attribute for the selected variable. You can view the contents of a reserved attribute by clicking on the button in the desired cell. Click the button in the cell to display the list of values. E Click Copy to copy the value labels and the measurement level. Existing value labels and missing value categories for target variable s are not replaced. The measurement level for the target variable is always replaced. The role for the target variable is always replaced. These conditions apply primarily to reading data or creating new variables via command syntax. Dialogs for reading data and creating new transformed variables automatically perform a data pass that sets the measurement level, based on the default measurement level rules. To set the measurement level for variables with an unknown measurement level E In the alert dialog that appears for the procedure, click Assign Manually. A variable can be treated as nominal when its values represent categories with no intrinsic ranking for example, the department of the company in which an employee works. A variable can be treated as scale continuous when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars. Multiple response sets use multiple variables to record responses to questions where the respondent can give more than one answer. Multiple response sets are treated like categorical variables, and most of the things you can do with categorical variables, you can also do with multiple response sets. For more information, see the topic Copying Data Properties on p. If your variables are coded as dichotomies, indicate which value you want to have counted. E Enter a unique name for each multiple response set. 
The name can be up to 63 bytes long. A dollar sign is automatically added to the beginning of the set name. E Enter a descriptive label for the set. The respondent can indicate multiple choices by checking a box next to each choice. In the multiple dichotomy set, the Counted Value is 1. The Variable Coding group indicates that the variables are dichotomous. The Counted Value is 1. E Select click one of the variables in the Variables in Set list. E Right-click the variable and select Variable Information from the pop-up context menu. Figure 7-8 Variable information for multiple dichotomy source variable The value labels indicate that the variable is a dichotomy with values of 0 and 1, representing No and Yes, respectively. Categories A multiple category set consists of multiple variables, all coded the same way, often with many possible response categories. Category Label Source For multiple dichotomies, you can control how sets are labeled. Use variable label as set label. You can also use variables in the active dataset as templates for other variables in the active dataset. Variable properties include value labels, missing values, level of measurement, variable labels, print and write formats, alignment, and column width in the Data Editor. Variable properties are copied from the source variable only to target variables of a matching type—string alphanumeric or numeric including numeric, date, and currency. Note: Copy Data Properties replaces Apply Data Dictionary, formerly available on the File menu. E Follow the step-by-step instructions in the Copy Data Properties Wizard. Selecting Source and Target Variables In this step, you can specify the source variables containing the variable properties that you want to copy and the target variables that will receive those variable properties. Variable properties are copied from one or more selected source variables to matching variables in the active dataset. 
By default, only matching variables are displayed in the two variable lists. Create matching variables in the active dataset if they do not already exist. Apply properties from a single source variable to selected active dataset variables of the same type. Variable properties from a single selected variable in the source list can be applied to one or more selected variables in the active dataset list. Only variables of the same type numeric or string as the selected variable in the source list are displayed in the active dataset list. This option is not available if the active dataset contains no variables. Note: You cannot create new variables in the active dataset with this option. No variable properties will be applied. Choosing Variable Properties to Copy You can copy selected variable properties from the source variables to the target variables. Figure 7-11 Copy Data Properties Wizard: Step 3 Value Labels. Value labels are descriptive labels associated with data values. Value labels are often used when numeric data values are used to represent non-numeric categories for example, codes of 1 and 2 for Male and Female. You can replace or merge value labels in the target variables. For more information, see the topic Custom Variable Attributes in Chapter 5 on p. Descriptive variable labels can contain spaces and reserved characters not allowed in variable names. The measurement level can be nominal, ordinal, or scale. For more information, see the topic Roles in Chapter 5 on p. For numeric variables, this controls numeric type such as numeric, date, or currency , width total number of displayed characters, including leading and trailing characters and decimal indicator , and number of decimal places displayed. This option is ignored for string variables. This affects only alignment left, right, center in Data View in the Data Editor. Data Editor Column Width. This affects only column width in Data View in the Data Editor. 
Variable sets are used to control the list of variables that are displayed in dialog boxes. Merge combines documents from the source and active dataset. All documents are then sorted by date. This overrides any weighting currently in effect in the active dataset. You can also choose to paste the generated command syntax into a syntax window and save the syntax for later use. Multiple cases share a common primary ID value but have different secondary ID values, such as family members who all live in the same house. Multiple cases represent the same case but with different values for variables other than those that identify the case, such as multiple purchases made by the same person or company for different products or at different times. E Select one or more variables that identify matching cases. E Select one or more of the options in the Variables to Create group. Figure 7-14 Identify Duplicate Cases dialog box Define matching cases by. Cases are considered duplicates if their values match for all selected variables. If you want to identify only cases that are a 100% match in all respects, select all of the variables. You can select additional sorting variables that will determine the sequential order of cases in each matching group. For each sort variable, you can sort in ascending or descending order. If you select multiple sort variables, cases are sorted by each variable within categories of the preceding variable in the list. Use the up and down arrow buttons to the right of the list to change the sort order of the variables. Indicator of primary cases. Sequential count of matching cases in each group. Creates a variable with a sequential value from 1 to n for cases in each matching group. Move matching cases to the top. Display frequencies for created variables. Frequency tables containing counts for each value of the created variables. 
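The logic behind the indicator and sequential-count variables can be sketched outside SPSS. The Python below is only an illustration, not the procedure's implementation: the function name and dictionary keys are hypothetical, it assumes the first case in each sorted matching group is treated as the primary case, and it numbers single-case groups starting at 1 as a simplification.

```python
from collections import defaultdict


def flag_duplicates(cases, match_keys, sort_key=None):
    """Return, for each case, a primary indicator and a within-group sequence.

    cases      -- list of dicts, one per case
    match_keys -- variable names that define matching cases
    sort_key   -- optional variable name that orders cases within a group
    """
    # Group case positions by their values on the matching variables.
    groups = defaultdict(list)
    for index, case in enumerate(cases):
        groups[tuple(case[key] for key in match_keys)].append(index)

    flags = [None] * len(cases)
    for indices in groups.values():
        if sort_key is not None:
            indices = sorted(indices, key=lambda i: cases[i][sort_key])
        for position, i in enumerate(indices, start=1):
            # Assumption: the first case in the sorted group is the primary one.
            flags[i] = {"primary": 1 if position == 1 else 0,
                        "sequence": position}
    return flags


cases = [
    {"id": 1, "date": 3},
    {"id": 2, "date": 1},
    {"id": 1, "date": 1},
    {"id": 1, "date": 2},
]
flags = flag_duplicates(cases, match_keys=["id"], sort_key="date")
duplicate_count = sum(1 for flag in flags if flag["primary"] == 0)
print(flags)
print(duplicate_count)
```

Counting the cases whose indicator is 0 gives the number of duplicates, which corresponds to what a frequency table of the indicator variable would report.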
For example, for the primary indicator variable, the table would show the number of cases with a value 0 for that variable, which indicates the number of duplicates, and the number of cases with a value of 1 for that variable, which indicates the number of unique and primary cases. You can use Visual Binning to: Create categorical variables from continuous scale variables. For example, you could use a scale income variable to create a new categorical variable that contains income ranges. Collapse a large number of ordinal categories into a smaller set of categories. For example, you could collapse a rating scale of nine down to three categories representing low, medium, and high. Figure 7-15 Initial dialog box for selecting variables to bin Optionally, you can limit the number of cases to scan. Note: String variables and nominal numeric variables are not displayed in the source variable list. Visual Binning requires numeric variables, measured on either a scale or ordinal level, since it assumes that the data values represent some logical order that can be used to group values in a meaningful fashion. For more information, see the topic Variable measurement level in Chapter 5 on p. E Select a variable in the Scanned Variable List. E Enter a name for the new binned variable. Variable names must be unique and must follow variable naming rules. For more information, see the topic Variable names in Chapter 5 on p. For more information, see the topic Binning Variables on p. Binning Variables Figure 7-16 Visual Binning, main dialog box The Visual Binning main dialog box provides the following information for the scanned variables: Scanned Variable List. Displays the variables you selected in the initial dialog box. You can sort the list by measurement level scale or ordinal or by variable label or name by clicking on the column headings. Indicates the number of cases scanned. 
All scanned cases without user-missing or system-missing values for the selected variable are used to generate the distribution of values used in calculations in Visual Binning, including the histogram displayed in the main dialog box and cutpoints based on percentiles or standard deviation units. Indicates the number of scanned cases with user-missing or system-missing values. Missing values are not included in any of the binned categories.
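As a rough illustration of percentile-based cutpoints, the sketch below bins a scale variable into four groups with Python's standard library. It is not the Visual Binning algorithm: the quantile method is whatever `statistics.quantiles` uses by default, treating each cutpoint as an included upper endpoint is an assumption, and `None` stands in for a missing value that is left out of every bin.

```python
from bisect import bisect_left
from statistics import quantiles


def percentile_bins(values, number_of_bins=4):
    """Return (cutpoints, bins) for `values`, skipping missing (None) entries."""
    valid = [value for value in values if value is not None]
    # number_of_bins - 1 cutpoints split the valid values into equal groups.
    cutpoints = quantiles(valid, n=number_of_bins)
    bins = []
    for value in values:
        if value is None:
            bins.append(None)  # missing values are not placed in any bin
        else:
            # Assumption: a value equal to a cutpoint falls in the lower bin.
            bins.append(bisect_left(cutpoints, value) + 1)
    return cutpoints, bins


income = [1, 2, 3, 4, 5, 6, 7, 8, None]
cutpoints, binned = percentile_bins(income)
print(cutpoints)
print(binned)
```

The missing entry stays missing in the binned variable, mirroring the statement that missing values are not included in any of the binned categories.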
