SAS language
The SAS language is a fourth-generation computer programming language used for statistical analysis, created by Anthony James Barr at North Carolina State University. Its primary applications include data mining and machine learning. The SAS language runs under compilers such as the SAS System that can be used on Microsoft Windows, Linux, UNIX and mainframe computers.
History
SAS was developed in the 1960s by Anthony James Barr, who built its fundamental structure, and SAS Institute CEO James Goodnight, who developed a number of features including analysis procedures. The language is currently developed and sponsored by the SAS Institute, of which Goodnight is founder and CEO.
Language
Base SAS is a fourth-generation procedural programming language designed for the statistical analysis of data. It is Turing-complete and domain specific, with many of the attributes of a command language. As an interpreted language, it is generally parsed, compiled, and executed step by step. The SAS system was originally a single instruction, single data engine, but single instruction, multiple data and multiple instruction, multiple data functionality was later added. Most Base SAS code can be ported between versions, but some functions and parameters are specific to certain operating systems and interfaces.
All SAS programs are written in the SAS language, although some packages use menu-driven graphical user interfaces on the front end. Various SAS editors use color coding to identify components like step boundaries, keywords and constants. SAS can read in data from common spreadsheets and databases and output the results of statistical analyses in tables, graphs, and as RTF, HTML and PDF documents.
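As a rough illustration of the output formats mentioned above, the following sketch sends the same results to HTML, PDF and RTF files through the Output Delivery System (ODS). It assumes the SASHELP.CLASS sample table that normally ships with SAS is available and that the session can write to its current directory; the file names are hypothetical.

```sas
/* Open three ODS destinations; the file names are hypothetical
   and are written relative to the session's working directory. */
ods html file="class_report.html";
ods pdf  file="class_report.pdf";
ods rtf  file="class_report.rtf";

/* Tabular output: list the rows of the sample table. */
proc print data=sashelp.class;
run;

/* Descriptive statistics for two numeric variables. */
proc means data=sashelp.class mean std;
    var height weight;
run;

/* Close each destination so the files are finalized. */
ods html close;
ods pdf  close;
ods rtf  close;
```

Each destination receives the output of every procedure that runs while it is open, so one pass over the data can produce all three documents.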
Syntax
The language consists of two main types of blocks: DATA blocks and PROC blocks. DATA blocks can be used to read and manipulate input data, and create data sets. PROC blocks are used to perform analyses and operations on these data sets, sort data, and output results in the form of descriptive statistics, tables, results, charts and plots. PROC SQL can be used to work with SQL syntax within SAS.
Users can input both numeric and character data into Base SAS. SAS statements must begin with a reserved keyword and end with a semicolon, but the language is otherwise flexible in terms of formatting and most statements are case insensitive. SAS statements can continue across multiple lines and do not require indenting, although indents can improve readability. Comments are delimited by /* and */.
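The syntax rules above can be sketched in a short program. This is a minimal illustration, not a canonical style: the data set name scores and its variables are hypothetical, and the mix of upper and lower case is deliberate to show that keywords are case insensitive.

```sas
/* DATA block: build a small data set from inline values.
   The data set and variable names are hypothetical. */
data scores;
    input name $ math reading;   /* $ marks a character variable */
    total = math + reading;      /* derived numeric variable */
    datalines;
Ann 90 85
Bob 72 88
Cy 65 70
;
run;

/* PROC block in upper case: keywords are case insensitive. */
PROC SORT DATA=scores;
    BY DESCENDING total;
RUN;

/* One statement may span several lines; it ends at the semicolon. */
proc means data=scores
           mean min max;
    var math reading total;
run;

/* PROC SQL: query the same data set with SQL syntax. */
proc sql;
    select name, total
    from scores
    where total > 150
    order by total desc;
quit;
```

Note that PROC SQL is terminated with QUIT rather than RUN, and that the indentation is purely for readability.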
A standard SAS program typically entails the definition of data, the creation of a data set, and the performance of procedures such as analysis on that data set. SAS scripts have the .sas extension.
A simple example of SAS code is the following:
* COMMENT;
DATA TEMP;
input X Y Z;
datalines;
1 2 3
5 6 7
;
run;
PROC PRINT DATA = TEMP;
RUN;