I'm trying to create a Hadoop sequence file.
I can successfully create the sequence file in HDFS, but when I try to read it, an error saying the file is "not a SequenceFile" occurs. I also checked that the created sequence file exists in HDFS.
Here is my source code, which writes a sequence file to HDFS and then reads it back.
package us.qi.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ArrayFile;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileText {
    public static void main(String args[]) throws IOException {

        /** Get Hadoop HDFS command and Hadoop Configuration */
        HDFS_Configuration conf = new HDFS_Configuration();
        HDFS_Test hdfs = new HDFS_Test();

        String uri = "hdfs://slave02:9000/user/hadoop/test.seq";

        /** Get Configuration from HDFS_Configuration Object by using get_conf() */
        Configuration config = conf.get_conf();

        SequenceFile.Writer writer = null;
        SequenceFile.Reader reader = null;

        try {
            Path path = new Path(uri);

            IntWritable key = new IntWritable();
            Text value = new Text();

            writer = SequenceFile.createWriter(config,
                    SequenceFile.Writer.file(path),
                    SequenceFile.Writer.keyClass(key.getClass()),
                    ArrayFile.Writer.valueClass(value.getClass()));
            reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

            writer.append(new IntWritable(11), new Text("test"));
            writer.append(new IntWritable(12), new Text("test2"));
            writer.close();

            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
            IOUtils.closeStream(reader);
        }
    }
}
And this is the error that occurs:
2018-09-17 17:15:34,267 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 17:15:38,870 INFO [main] compress.CodecPool (CodecPool.java:getCompressor(153)) - Got brand-new compressor [.deflate]
java.io.EOFException: hdfs://slave02:9000/user/hadoop/test.seq not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1933)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1892)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
    at us.qi.hdfs.SequenceFileText.main(SequenceFileText.java:36)
Answer
That was my mistake. I changed some of the source code.
First, I check whether the file already exists in HDFS. If it does not, I create the writer object and write the records.
Once the writer has finished and is closed, I check the sequence file and only then create the reader. After that, the sequence file is read successfully.
Here is my code. Thanks!
// Note: fs (a FileSystem for this URI) and logger are not shown in the post;
// they are assumed to be initialized earlier in main().
try {
    Path path = new Path(uri);

    IntWritable key = new IntWritable();
    Text value = new Text();

    /** First, check whether the file already exists.
     * If it does not exist in HDFS, the writer object is created.
     */
    if (!fs.exists(path)) {
        writer = SequenceFile.createWriter(config,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(key.getClass()),
                ArrayFile.Writer.valueClass(value.getClass()));

        writer.append(new IntWritable(11), new Text("test"));
        writer.append(new IntWritable(12), new Text("test2"));
        writer.close();
    } else {
        logger.info(path + " already exists.");
    }

    /** Create the SequenceFile Reader object only after the writer is closed. */
    reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

    while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
    }
    reader.close();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    IOUtils.closeStream(writer);
    IOUtils.closeStream(reader);
}
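
Why the reordering helps can be illustrated without a cluster. The sketch below is an analogy in plain Java, not Hadoop code. It writes a small header through a buffered stream and tries to read it back once before and once after the stream is closed. `HeaderOrderDemo`, `MAGIC`, `hasHeader` and `readBeforeAndAfterClose` are hypothetical names, and the 4-byte header is only a stand-in for a real SequenceFile header. It assumes the original failure happened because the reader was opened before the writer had flushed anything a reader could see. That would be consistent with the EOFException in the stack trace, but the post does not confirm it.

```java
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class HeaderOrderDemo {

    // Hypothetical stand-in header; NOT Hadoop's actual on-disk format.
    static final byte[] MAGIC = {'S', 'E', 'Q', 1};

    /** Returns true only if the file is long enough and starts with MAGIC. */
    static boolean hasHeader(Path file) {
        byte[] header = new byte[MAGIC.length];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            in.readFully(header);            // throws EOFException on a too-short file
        } catch (EOFException e) {
            return false;                    // nothing (or too little) to read yet
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.equals(header, MAGIC);
    }

    /** Checks the header while the writer is still open, then again after close(). */
    static boolean[] readBeforeAndAfterClose() {
        try {
            Path file = Files.createTempFile("demo", ".seq");
            try {
                boolean before;
                try (DataOutputStream out = new DataOutputStream(
                        new BufferedOutputStream(Files.newOutputStream(file)))) {
                    out.write(MAGIC);
                    out.writeInt(11);
                    out.writeUTF("test");
                    // The bytes are still in the in-memory buffer at this point.
                    before = hasHeader(file);
                }                            // close() flushes the buffer to the file
                boolean after = hasHeader(file);
                return new boolean[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = readBeforeAndAfterClose();
        System.out.println("before close: " + result[0]);
        System.out.println("after close: " + result[1]);
    }
}
```

Run as-is, this prints `before close: false` followed by `after close: true`. In the corrected code above the same ordering applies: the reader is constructed only after `writer.close()` has returned.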