These notes are for Pig 0.17.0 release. Highlights ========== The highlights of this release includes Pig on Spark System Requirements =================== 1. Java 1.7.x or newer, preferably from Sun. Set JAVA_HOME to the root of your Java installation 2. Ant build tool: http://ant.apache.org - to build source only 3. Run under Unix and Windows 4. This release is compatible with Hadoop 2.5+ releases. Note Hadoop 1.X is not supported anymore. 5. For using Spark execution engine Spark 1.6.x is required. (Spark 2 support is likely to come in the next release) Trying the Release ================== 1. Download pig-0.17.0.tar.gz 2. Unpack the file: tar -xzvf pig-0.17.0.tar.gz 3. Move into the installation directory: cd pig-0.17.0 4. To run pig without Hadoop cluster, execute the command below. This will take you into an interactive shell called grunt that allows you to navigate the local file system and execute Pig commands against the local files bin/pig -x local 5. To run on your Hadoop cluster, you need to set PIG_CLASSPATH environment variable to point to the directory with your hadoop-site.xml file and then run pig. The commands below will take you into an interactive shell called grunt that allows you to navigate Hadoop DFS and execute Pig commands against it export PIG_CLASSPATH=/hadoop/conf bin/pig 6. To build your own version of pig.jar run ant 7. To run unit tests run ant test 8. To build jar file with available user defined functions run commands below. cd contrib/piggybank/java ant 9. To build the tutorial: cd tutorial ant 10. To run tutorial follow instructions in http://wiki.apache.org/pig/PigTutorial Relevant Documentation ====================== Pig Documentation: http://pig.apache.org/docs/r0.17.0/ Pig Wiki: https://cwiki.apache.org/confluence/display/PIG/Index Pig Tutorial: https://cwiki.apache.org/confluence/display/PIG/PigTutorial