Home page of Eric Pement
The awk programming language

awk is a programming language that gets its name from the 3 people who invented it (Aho, Weinberger, and Kernighan). Developed on a Unix operating system, its name is usually printed in lower-case ("awk") instead of capitalized ("Awk"). awk is distributed as free software, meaning that you don't have to pay anything for it and you can get the source code to build awk yourself.

awk is much easier to learn than C, C++, Java, or many other languages. awk excels at handling text and data files, the kind that are created in Notepad or (for example) HTML files. You wouldn't use awk to modify a Microsoft Word document or an Excel spreadsheet. However, if you take the Word document and Save As "Text Only" or if you take the Excel spreadsheet and Save As tab-delimited (*.txt) or comma-separated (*.csv) output files, then awk could do a good job at handling them.

I like awk because it's concise. The shortest awk program that does anything useful is just 1 character:
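A well-known one-character awk program is the pattern `1`. A pattern with no action prints the current line, and `1` is always true, so awk reads each input line and writes it back using its own output line ending. The following is a minimal sketch, assuming a POSIX-style shell with awk on the PATH. The file names `unix.txt` and `dos.txt` are hypothetical, and the `ORS` setting is an assumption added only so the effect can be seen on a Unix system, where plain `awk 1` leaves LF line endings unchanged.

```shell
# Create a small LF-terminated sample file (hypothetical name).
printf 'alpha\nbeta\n' > unix.txt

# The one-character program: the pattern 1 is always true, and a pattern
# with no action prints the current line.
awk 1 unix.txt

# Assumption for illustration: on Unix, set the output record separator
# to CR+LF to imitate the line endings a DOS/Windows awk would write.
awk -v ORS='\r\n' 1 unix.txt > dos.txt
```

With an awk build that writes CR+LF by default, the plain form `awk 1 unix.txt > dos.txt` should be enough and no `ORS` setting is needed.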
On a DOS/Windows machine, this converts Unix line endings (LF) to standard DOS line endings (CR+LF).

awk programs are often called "scripts" because they don't require an intermediate stage of compiling the program into an executable form like an *.EXE file. In fact, awk programs are almost never compiled into *.EXE files (though it's possible to do this). Thus, many people refer to awk as a "scripting language" instead of a "programming language."

Normally, awk is run from a command prompt. However, if you need to run a custom awk program from the Windows desktop (usually, because you want to run the same script over and over), instead of creating a desktop shortcut to "awk.exe", create a shortcut to a script or batch file. An awk batch file for Windows could look like this:

    @echo off
    c:
    cd \path\to\some\directory
    awk.exe -f myscript.awk inputfile.txt > outputfile.txt
    :: All done. Show a message to the Windows user
    if not exist c:\temp\NUL mkdir c:\temp
    echo result = MsgBox("Output file successfully created",0,"File created") > c:\temp\msg.vbs
    %windir%\system32\cscript.exe //Nologo c:\temp\msg.vbs
    del /q c:\temp\msg.vbs

If you work in an enterprise or commercial environment, your version of Windows may have "cscript.exe" (a/k/a Windows Script Host) turned off or removed from the PATH for security reasons, as it can be a vehicle for malicious exploitation. It might be available, but just not in the expected directory.

Get precompiled binaries for awk

Get awk, precompiled for Windows, from one of these locations:
NOTE: There are other versions of GNU awk for Win32, including compilations called GnuWin32 and UnxUtils (both on Sourceforge, if you want to search for them), but they are significantly older and less reliable than the ones above.

There is also a compilation called DJGPP from Delorie.com, designed to work on Intel 80386 (and higher) PCs running MS-DOS or DOS compatibles, such as PC-DOS, DR-DOS, PTS-DOS, or FreeDOS. If you are running one of these versions of DOS, you may benefit from the Delorie versions.

Aside from that, the DJGPP utilities (all of them, not just awk) have one other unique feature (or benefit). Because they are written with Unix users in mind, they emulate the 'single quote' and "double quote" system of parsing command-line arguments. In other words, with Windows utilities such as EZWinPorts, Klabaster awk, Mawk, GnuWin32, and UnxUtils, parameters to the utilities must be entered in "double quotes" only. If you enter this at the DOS command line:

    echo Hello | sed 's/.*/&, world/'

it will not be recognized as a valid command, due to the presence of the 'single quotes'. The CMD shell wants "double quotes". The Delorie utilities allow you to use 'single quotes' or "double quotes".

That is the one benefit that these versions have, although there are limits. In a true Unix shell (ksh, bash, etc.), single quotes protect special characters such as the redirection arrows (<, >) and the pipe (|). The Delorie utilities do not protect these characters with single quotes. If you are running in a Microsoft Windows environment such as Windows 7, using CMD.EXE (or better, a command shell like Take Command), the GnuWin32, EZWinPorts, or Klabaster utilities are probably a better choice.

Things I wrote for awk
Tutorials
Discussion forums, newsgroups
These pages created with GNU Emacs, xhtmlpp, Take Command, and Altap Salamander. Icons courtesy of Qbullets.