Problem description
I'm looking for a simple Java snippet to remove empty tags from any XML structure. For example:
<xml>
    <field1>bla</field1>
    <field2></field2>
    <field3/>
    <structure1>
        <field4>bla</field4>
        <field5></field5>
    </structure1>
</xml>
should become:
<xml>
    <field1>bla</field1>
    <structure1>
        <field4>bla</field4>
    </structure1>
</xml>
Recommended answer
I was wondering whether it would be easy to do this with the XOM library and gave it a try.
It turned out to be quite simple:
import nu.xom.*;

import java.io.File;
import java.io.IOException;

public class RemoveEmptyTags {

    public static void main(String[] args) throws IOException, ParsingException {
        Document document = new Builder().build(new File("original.xml"));
        handleNode(document.getRootElement());
        System.out.println(document.toXML()); // empty elements now removed
    }

    private static void handleNode(Node node) {
        if (node.getChildCount() == 0 && "".equals(node.getValue())) {
            node.getParent().removeChild(node);
            return;
        }
        // recurse the children
        for (int i = 0; i < node.getChildCount(); i++) {
            handleNode(node.getChild(i));
        }
    }
}
This probably won't handle all corner cases properly, such as a completely empty document. And what should happen to elements that are otherwise empty but have attributes?
If you want to keep XML tags that have attributes, you can add the following check to the condition in 'handleNode':
... && ((Element) node).getAttributeCount() == 0)
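Also, if the XML has two or more empty tags one after another, this recursive method doesn't remove all of them! Removing a child shifts its following siblings down by one index while the loop counter in 'handleNode' still advances, so the sibling that moves into the freed slot is never visited.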
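The caveats above (the root element, elements with attributes, and consecutive empty siblings) can be handled together. What follows is a minimal sketch rather than the original XOM code: it swaps in the JDK's built-in org.w3c.dom API so it runs without extra dependencies. The class name PruneEmptyElements and the methods parse, prune, and isEmpty are hypothetical names chosen for illustration. It also makes two assumptions that go beyond the original snippet: whitespace-only elements count as empty, and a parent that becomes empty after its children are pruned is removed as well.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK DOM API instead of XOM.
final class PruneEmptyElements {

    // Parses an XML string into a DOM document.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Removes empty descendant elements of 'parent' and returns how many
    // were removed. 'parent' itself is never removed, so the root survives.
    public static int prune(Element parent) {
        int removed = 0;
        NodeList children = parent.getChildNodes();
        // Walk backwards: removing the child at index i only shifts
        // siblings that have already been visited.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (!(child instanceof Element)) {
                continue;
            }
            Element element = (Element) child;
            removed += prune(element); // handle descendants first
            if (isEmpty(element)) {
                parent.removeChild(element);
                removed++;
            }
        }
        return removed;
    }

    // Assumption: "empty" means no attributes, no child elements,
    // and no text other than whitespace.
    public static boolean isEmpty(Element element) {
        if (element.hasAttributes()) {
            return false;
        }
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return false;
            }
            if (n instanceof Text && !n.getNodeValue().trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }
}
```

Calling PruneEmptyElements.prune(doc.getDocumentElement()) on a parsed document prunes it in place. Iterating from the last child to the first is what avoids the skipped-sibling problem. If a parent emptied by pruning should be kept, one option is to test isEmpty before recursing instead of after.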
(This answer is part of my evaluation of XOM as a potential replacement for dom4j.)