Reading Word styles with POI

I. Can Java use POI to read the tables in a Word file directly and generate a new Word file with the styles preserved?

1. Jar files needed to read Word 2003 and Word 2007 files

Reading a Word 2003 (.doc) file is relatively simple: only poi-3.5-beta6-20090622.jar and poi-scratchpad-3.5-beta6-20090622.jar are needed. Word 2007 (.docx) takes more setup. The code itself is no harder to write, but more jars have to be imported, seven in total:

1. openxml4j-bin-beta.jar

2. poi-3.5-beta6-20090622.jar

3. poi-ooxml-3.5-beta6-20090622.jar

4. dom4j-1.6.1.jar

5. geronimo-stax-api_1.0_spec-1.0.jar

6. ooxml-schemas-1.0.jar

7. xmlbeans-2.3.0.jar

Items 4-7 are the jars that poi-ooxml-3.5-beta6-20090622.jar depends on (they can be found in the ooxml-lib directory of poi-bin-3.5-beta6-20090622.tar.gz).

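2. Line breaks

Hard line break: a break stored in the file itself, entered by pressing Enter on the keyboard.

Soft line break: a line can hold only a limited number of characters, so when the text exceeds that limit it automatically wraps to the next line on display.

To a program, only hard line breaks are identifiable, definite breaks; where a soft break falls depends on font size and indentation.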
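Because only hard breaks are stored in the text, a program can split extracted text on them but cannot recover where a line wrapped on screen. The following is a minimal sketch using only the JDK, not POI itself; `HardBreakSplitter` and `splitOnHardBreaks` are hypothetical names, and the sample string merely stands in for whatever an extractor returns.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits extracted text on hard line breaks only. */
final class HardBreakSplitter {

    /**
     * Returns one entry per hard-broken line. The regex "\\R" matches
     * \r\n, \r and \n (among other line terminators), so it copes with
     * whichever terminator the text happens to contain.
     */
    static List<String> splitOnHardBreaks(String text) {
        List<String> lines = new ArrayList<>();
        // limit -1 keeps trailing empty strings, so blank paragraphs are not lost
        for (String line : text.split("\\R", -1)) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample text: three paragraphs, mixed terminators
        String extracted = "A first paragraph long enough to wrap on screen.\r\nSecond paragraph.\rThird.";
        for (String line : splitOnHardBreaks(extracted)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

However wide the window is, the long first paragraph stays one element, since its soft wraps never appear in the string.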

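3. Caveats when reading

Note that POI's text extraction does not read the images in a Word file. Also, for Word 2007 (.docx) files that contain tables, all table data ends up at the very end of the extracted string.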

4. Code for reading the text content of a Word file

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.POIXMLDocument;
import org.apache.poi.POIXMLTextExtractor;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;

public class Test {
    public static void main(String[] args) {
        try {
            // Word 2003 (.doc): extract plain text with HWPF's WordExtractor
            InputStream is = new FileInputStream(new File("2003.doc"));
            WordExtractor ex = new WordExtractor(is);
            String text2003 = ex.getText();
            System.out.println(text2003);
            is.close();

            // Word 2007 (.docx): open the OOXML package and extract text with XWPF
            OPCPackage opcPackage = POIXMLDocument.openPackage("2007.docx");
            POIXMLTextExtractor extractor = new XWPFWordExtractor(opcPackage);
            String text2007 = extractor.getText();
            System.out.println(text2007);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

II. How can a Word file read with POI be displayed on a page in its original format?

The text I get from reading a Word file with POI comes back without its spaces and line breaks. How can this be solved?

------ Solution ------

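// Requires org.apache.poi.hwpf.HWPFDocument and org.apache.poi.hwpf.extractor.WordExtractor
public static void main(String[] args) {
    File file = new File("D:/test.doc");
    try {
        FileInputStream fis = new FileInputStream(file);
        HWPFDocument hwpfd = new HWPFDocument(fis);
        WordExtractor wordExtractor = new WordExtractor(hwpfd);
        String[] paragraph = wordExtractor.getParagraphText(); // one element per paragraph
        // The source breaks off inside this loop header; the rest is an assumed completion.
        for (int i = 0; i < paragraph.length; i++) {
            System.out.println(paragraph[i]);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}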
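Once the text is available as an array of paragraph strings, keeping spaces and line breaks visible on a page is largely an HTML matter, because browsers collapse whitespace by default. The sketch below is JDK-only and independent of POI. `ParagraphHtml`, `escape` and `toHtml` are hypothetical names, and it assumes each array element holds one paragraph, possibly ending in a terminator such as \r\n.

```java
/** Renders extracted paragraphs as HTML that keeps spaces and line breaks. */
final class ParagraphHtml {

    /** Escapes the characters that are special in HTML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps each paragraph in its own <p> element. */
    static String toHtml(String[] paragraphs) {
        StringBuilder html = new StringBuilder();
        for (String p : paragraphs) {
            // Drop the paragraph terminator left at the end of the string
            String body = p.replaceAll("[\\r\\n]+$", "");
            // white-space: pre-wrap tells the browser to keep runs of spaces and tabs
            html.append("<p style=\"white-space: pre-wrap\">")
                .append(escape(body))
                .append("</p>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample paragraphs
        String[] paragraphs = {"Name:    Alice\r\n", "a < b\r\n"};
        System.out.print(toHtml(paragraphs));
    }
}
```

This keeps only whitespace and paragraph boundaries. Fonts, colors and tables are not carried over, because they are not part of the extracted strings.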

III. How does Java POI work with the Word format?

1. Environment

1.1 Add POI support. Package download address: http://www.apache.org/dyn/closer.cgi/poi/release/

1.2 POI makes reading Excel files fairly convenient, and it also supports reading Word files in DOC format. However, its release distribution does not include the Word module, so an extension jar for POI has to be downloaded separately. The download address is http://www.ibiblio.org/maven2/org/textmining/tm-extractors/0.4/ (download the file extractors-0.4_zip).

package com.ray.poi.util;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.DocumentEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.textmining.text.extraction.WordExtractor;

/**
 * Reads and writes doc files.
 * @author wangzonghao
 */
public class POIWordUtil {

    /**
     * Reads a doc file.
     * @param doc path of the DOC file
     * @return the extracted text
     * @throws Exception
     */
    public static String readDoc(String doc) throws Exception {
        // Open an input stream on the DOC file
        FileInputStream in = new FileInputStream(new File(doc));
        WordExtractor extractor = null;
        String text = null;
        // Create the WordExtractor
        extractor = new WordExtractor();
        // Extract the text of the DOC file
        text = extractor.extractText(in);
        in.close();
        return text;
    }

    /**
     * Writes a doc file.
     * @param path path of the file to write
     * @param content text to store
     * @return true if the file was written
     */
    public static boolean writeDoc(String path, String content) {
        boolean w = false;
        try {
            // byte b[] = content.getBytes("ISO-8859-1");
            byte b[] = content.getBytes();
            ByteArrayInputStream bais = new ByteArrayInputStream(b);
            POIFSFileSystem fs = new POIFSFileSystem();
            DirectoryEntry directory = fs.getRoot();
            // Stores the raw text bytes in a stream named "WordDocument"
            DocumentEntry de = directory.createDocument("WordDocument", bais);
            FileOutputStream ostream = new FileOutputStream(path);
            fs.writeFilesystem(ostream);
            bais.close();
            ostream.close();
            w = true; // report success only after the file has been written
        } catch (IOException e) {
            e.printStackTrace();
        }
        return w;
    }
}

Test

package com.ray.poi.util;

import junit.framework.TestCase;

public class POIUtilTest extends TestCase {

    public void testReadDoc() {
        try {
            String text = POIWordUtil.readDoc("E:/work_space/poi/com/ray/poi/util/demo.doc");
            System.out.println(text);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void testWriteDoc() {
        String wr;
        try {
            wr = POIWordUtil.readDoc("E:/work_space/poi/com/ray/poi/util/demo.doc");
            boolean b = POIWordUtil.writeDoc("c:\\demo.doc", wr);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

IV. Can formatting be set when exporting Word files with Java POI?

The answer posted under this question repeats section I (required jars, line breaks, reading caveats, and the text-extraction code).

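V. Reading Word content together with its formatting in Java

// Tables (doc is assumed to be an XWPFDocument that has already been opened)
List<XWPFTable> tableList = doc.getTables();
for (int i = 0; i < tableList.size(); i++) {
    System.out.println(i);
    XWPFTable table = tableList.get(i);
    System.out.println(table.getText());
}

This gets the content of the tables. But what do you mean by formatting? Things like the font of each character?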

VI. How to read and write Word documents with Java and POI

Hello, see whether the following code works for you. (It uses the iText com.lowagie classes rather than POI, and writes RTF through RtfWriter2 into a file with a .doc extension.)

package com.sample;

import java.awt.Color;
import java.io.FileOutputStream;
import java.io.IOException;

import com.lowagie.text.Cell;
import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Element;
import com.lowagie.text.Font;
import com.lowagie.text.FontFactory;
import com.lowagie.text.Image;
import com.lowagie.text.PageSize;
import com.lowagie.text.Paragraph;
import com.lowagie.text.Phrase;
import com.lowagie.text.Table;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.rtf.RtfWriter2;

/**
 * @author wangyanjun
 * @email bd_wyj@sina.com
 * @createDate Jun 12, 2008
 */
public class CreateWordDemo {

    public void createDocContext(String file) throws DocumentException, IOException {
        // Set the page size
        Document document = new Document(PageSize.A4);
        // Create a writer bound to the document; the writer saves the document to disk
        RtfWriter2.getInstance(document, new FileOutputStream(file));
        document.open();
        // Chinese font
        BaseFont bfChinese = BaseFont.createFont("STSongStd-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
        // Title font style
        Font titleFont = new Font(bfChinese, 12, Font.BOLD);
        // Body font style
        Font contextFont = new Font(bfChinese, 10, Font.NORMAL);
        Paragraph title = new Paragraph("标题");
        // Center the title
        title.setAlignment(Element.ALIGN_CENTER);
        title.setFont(titleFont);
        document.add(title);

        String contextString = "iText是一个能够快速产生PDF文件的java类库。"
                + " \n" // line break
                + "iText的java类对于那些要产生包含文本，"
                + "表格，图形的只读文档是很有用的。它的类库尤其与java Servlet有很好的给合。"
                + "使用iText与PDF能够使你正确的控制Servlet的输出。";
        Paragraph context = new Paragraph(contextString);
        // Left-align the body text
        context.setAlignment(Element.ALIGN_LEFT);
        context.setFont(contextFont);
        // Space before the paragraph (distance from the title)
        context.setSpacingBefore(5);
        // First-line indent
        context.setFirstLineIndent(20);
        document.add(context);

        // FontFactory combined with Font and Color gives all kinds of font styles
        // (Font.UNDERLINE underlines, Font.BOLD makes bold)
        Paragraph underline = new Paragraph("下划线的实现",
                FontFactory.getFont(FontFactory.HELVETICA_BOLDOBLIQUE, 18, Font.UNDERLINE, new Color(0, 0, 255)));
        document.add(underline);

        // Set up the table
        Table aTable = new Table(3);
        int width[] = {25, 25, 50};
        aTable.setWidths(width); // relative width of each column
        aTable.setWidth(90); // 90% of the page width
        aTable.setAlignment(Element.ALIGN_CENTER); // center horizontally
        aTable.setAlignment(Element.ALIGN_MIDDLE); // center vertically
        aTable.setAutoFillEmptyCells(true); // fill empty cells automatically
        aTable.setBorderWidth(1); // border width
        aTable.setBorderColor(new Color(0, 125, 255)); // border color
        aTable.setPadding(2); // cell padding (easiest to understand by looking at the result)
        aTable.setSpacing(3); // spacing between cells
        aTable.setBorder(2); // border

        // Table header:
        // cell.setHeader(true) displays the cell as header information;
        // cell.setColspan(3) makes the cell span 3 columns.
        // Once the header cells have been added, endHeaders() must be called,
        // otherwise the header is not shown again when the table spans pages.
        Cell haderCell = new Cell("表格表头");
        haderCell.setHeader(true);
        haderCell.setColspan(3);
        aTable.addCell(haderCell);
        aTable.endHeaders();

        Font fontChinese = new Font(bfChinese, 12, Font.NORMAL, Color.GREEN);
        Cell cell = new Cell(new Phrase("这是一个测试的 3*3 Table 数据", fontChinese));
        cell.setVerticalAlignment(Element.ALIGN_TOP);
        cell.setBorderColor(new Color(255, 0, 0));
        cell.setRowspan(2);
        aTable.addCell(cell);
        aTable.addCell(new Cell("#1"));
        aTable.addCell(new Cell("#2"));
        aTable.addCell(new Cell("#3"));
        aTable.addCell(new Cell("#4"));
        Cell cell3 = new Cell(new Phrase("一行三列数据", fontChinese));
        cell3.setColspan(3);
        cell3.setVerticalAlignment(Element.ALIGN_CENTER);
        aTable.addCell(cell3);
        document.add(aTable);
        document.add(new Paragraph("\n"));

        // Add an image
        Image img = Image.getInstance("d:\\img01800.jpg");
        img.setAbsolutePosition(0, 0);
        img.setAlignment(Image.RIGHT); // image position
        img.scaleAbsolute(12, 35); // set the display size directly
        img.scalePercent(50); // display at 50% of the original size
        img.scalePercent(25, 12); // width and height scaling percentages
        img.setRotation(30); // rotate the image
        document.add(img);
        document.close();
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        CreateWordDemo word = new CreateWordDemo();
        String file = "c:/demo1.doc";
        // The source listing is cut off at this point; the call below is an assumed completion.
        try {
            word.createDocContext(file);
        } catch (DocumentException | IOException e) {
            e.printStackTrace();
        }
    }
}

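VII. How to read and write Word documents with Java and POI

How can Word documents be read and written with Java and POI? Is it possible to read the entire content of a Word file and put it into a newly generated Word file, keeping the tables, images and so on, with the formatting unchanged?

An example would be ideal. Most of what is online is the old solution shown below. It can only read text content, and the newly generated Word file always prompts for an encoding when opened, so it is not very usable. Is there a newer solution?

Working with Word using POI

1.1 Add POI support (the package download address is missing from this post).

1.2 POI makes reading Excel files fairly convenient, and it also supports reading Word files in DOC format. However, its release distribution does not include the Word module, so an extension jar for POI has to be downloaded separately: the file extractors-0.4_zip (the download address is missing from this post).

2. Extracting the content of a Doc file

public static String readDoc(String doc) throws Exception {
    // Open an input stream on the DOC file
    FileInputStream in = new FileInputStream(new File(doc));
    WordExtractor extractor = null;
    String text = null;
    // Create the WordExtractor
    extractor = new WordExtractor();
    // Extract the text of the DOC file
    text = extractor.extractText(in);
    return text;
}

public static void main(String[] args) {
    try {
        String text = WordReader.readDoc("c:/test.doc");
        System.out.println(text);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

3. Writing a Doc file

import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.DocumentEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class WordWriter {

    public static boolean writeDoc(String path, String content) {
        boolean w = false;
        try {
            // byte b[] = content.getBytes("ISO-8859-1");
            byte b[] = content.getBytes();
            ByteArrayInputStream bais = new ByteArrayInputStream(b);
            POIFSFileSystem fs = new POIFSFileSystem();
            DirectoryEntry directory = fs.getRoot();
            // Stores the raw text bytes in a stream named "WordDocument"
            DocumentEntry de = directory.createDocument("WordDocument", bais);
            FileOutputStream ostream = new FileOutputStream(path);
            fs.writeFilesystem(ostream);
            bais.close();
            ostream.close();
            w = true; // report success only after the file has been written
        } catch (IOException e) {
            e.printStackTrace();
        }
        return w;
    }

    public static void main(String[] args) throws Exception {
        String wr = WordReader.readDoc("D:\\test.doc");
        boolean b = writeDoc("D:\\result.doc", wr);
    }
}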

VIII. Exporting Word from a template with POI

// Fragment: uses java.io.*, java.util.Enumeration, java.util.zip.*,
// javax.xml.parsers.DocumentBuilderFactory, javax.xml.transform.*,
// org.w3c.dom.Document and org.w3c.dom.Element

// Open the .docx as a zip archive and read word/document.xml as text
ZipFile docxFile = new ZipFile(new File("c:/3.docx"));
ZipEntry documentXML = docxFile.getEntry("word/document.xml");
InputStream documentXMLIS = docxFile.getInputStream(documentXML);
String s = "";
InputStreamReader reader = new InputStreamReader(documentXMLIS, "UTF-8");
BufferedReader br = new BufferedReader(reader);
String str = null;
while ((str = br.readLine()) != null) {
    s = s + str;
}
// replace() matches literally; replaceAll() would parse "${key}" as a regular expression
s = s.replace("${key}", "替换内容");
System.out.println(s);
br.close();

//ZipEntry imgFile = docxFile.getEntry("word/media/image1.png");
// Parse the same XML as a DOM and walk down to the first text element
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
InputStream documentXMLIS1 = docxFile.getInputStream(documentXML);
Document doc = dbf.newDocumentBuilder().parse(documentXMLIS1);
Element docElement = doc.getDocumentElement();
//assertEquals("w:document", docElement.getTagName());
Element bodyElement = (Element) docElement.getElementsByTagName("w:body").item(0);
//assertEquals("w:body", bodyElement.getTagName());
Element pElement = (Element) bodyElement.getElementsByTagName("w:p").item(0);
//assertEquals("w:p", pElement.getTagName());
Element rElement = (Element) pElement.getElementsByTagName("w:r").item(0);
//assertEquals("w:r", rElement.getTagName());
Element tElement = (Element) rElement.getElementsByTagName("w:t").item(0);
//assertEquals("w:t", tElement.getTagName());
//assertEquals("这是第一个测试文档", tElement.getTextContent());
//tElement.setTextContent("这是第一个用Java写的测试文档");
Transformer t = TransformerFactory.newInstance().newTransformer();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
t.transform(new DOMSource(doc), new StreamResult(baos));

// Write a new archive: replace document.xml and the image, copy every other entry
ZipOutputStream docxOutFile = new ZipOutputStream(new FileOutputStream("response.docx"));
Enumeration<? extends ZipEntry> entriesIter = docxFile.entries();
byte[] buf = new byte[8192];
int n;
while (entriesIter.hasMoreElements()) {
    ZipEntry entry = entriesIter.nextElement();
    System.out.println(entry.getName());
    if (entry.getName().equals("word/document.xml")) {
        byte[] data = baos.toByteArray();
        docxOutFile.putNextEntry(new ZipEntry(entry.getName()));
        byte[] datas = s.getBytes("UTF-8");
        docxOutFile.write(datas, 0, datas.length);
        //docxOutFile.write(data, 0, data.length);
        docxOutFile.closeEntry();
    } else {
        // The image entry is replaced by c:/aaa.jpg; all other entries are copied as they are.
        // Read in a loop until end of stream: a single read() sized by available()
        // may return fewer bytes than the entry holds.
        InputStream incoming = entry.getName().equals("word/media/image1.png")
                ? new FileInputStream("c:/aaa.jpg")
                : docxFile.getInputStream(entry);
        docxOutFile.putNextEntry(new ZipEntry(entry.getName()));
        while ((n = incoming.read(buf)) != -1) {
            docxOutFile.write(buf, 0, n);
        }
        incoming.close();
        docxOutFile.closeEntry();
    }
}
docxOutFile.close();
docxFile.close();
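The approach above treats the .docx as a zip archive and swaps placeholders in word/document.xml as plain text. Two details are easy to get wrong: `String.replaceAll` treats `${key}` as a regular expression, which Java rejects, and `InputStream.available()` is not a reliable way to size a read buffer. The following JDK-only sketch works on byte arrays so it can be tried without any files. `DocxTemplateFiller`, `fill`, `zip` and `entryText` are hypothetical names, and it assumes each placeholder sits inside a single text run of the XML, which Word does not guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Fills ${key} placeholders in word/document.xml of a .docx held in memory. */
final class DocxTemplateFiller {

    static byte[] fill(byte[] docx, Map<String, String> values) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(docx));
             ZipOutputStream out = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                byte[] data = readAll(in);
                if (entry.getName().equals("word/document.xml")) {
                    String xml = new String(data, StandardCharsets.UTF_8);
                    for (Map.Entry<String, String> v : values.entrySet()) {
                        // String.replace is literal, so "${" needs no escaping
                        xml = xml.replace("${" + v.getKey() + "}", escapeXml(v.getValue()));
                    }
                    data = xml.getBytes(StandardCharsets.UTF_8);
                }
                // A fresh ZipEntry avoids reusing sizes recorded for the old data
                out.putNextEntry(new ZipEntry(entry.getName()));
                out.write(data);
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The archive is complete only after the ZipOutputStream has been closed
        return result.toByteArray();
    }

    /** Builds a small in-memory archive; used here only to create demo input. */
    static byte[] zip(Map<String, String> entries) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> part : entries.entrySet()) {
                out.putNextEntry(new ZipEntry(part.getKey()));
                out.write(part.getValue().getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the named entry decoded as UTF-8, or null if it is absent. */
    static String entryText(byte[] zip, String name) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    return new String(readAll(in), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the current entry to its end, looping until read() returns -1. */
    private static byte[] readAll(ZipInputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /** Keeps the XML well-formed when a value contains markup characters. */
    private static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

For a real template, the bytes of the .docx file would be passed to `fill` and the result written out as the new document. Because every other entry is copied unchanged, styles, tables and images in the template survive.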
