Reading Word with Python

How to read a Word document with Python

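To read a plain text file, use Python's built-in open():

    # The with statement closes the file even if reading fails.
    with open('/file', 'r') as f:
        print(f.read())

To read a Word document, the third-party package python-docx is recommended; it can be downloaded from its official site. Usage:

    import docx

    # file_path is the path to the .docx file
    document = docx.Document(file_path)
    docText = '\n\n'.join(paragraph.text for paragraph in document.paragraphs)
    print(docText)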
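If installing python-docx is not an option, the paragraph text can also be pulled out with the standard library alone, because a .docx file is a ZIP archive whose main body lives in word/document.xml. The following is a minimal sketch of that swapped-in technique, not python-docx's own implementation; the function name read_docx_paragraphs is hypothetical, and the sketch ignores headers, footers and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_paragraphs(path):
    """Return the text of every paragraph in a .docx file (hypothetical helper)."""
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # The text of one paragraph is split across one or more <w:t> elements.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```

Calling '\n\n'.join(read_docx_paragraphs(file_path)) gives roughly the same string as the python-docx version above for simple documents.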

Python: read the same column from every line of a document, join the values end to end, and write the result to another ...

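Suppose the data is stored in the file test.txt. The program is as follows (untested, but this is the general idea):

    text = []
    with open(r'test.txt') as f:
        for line in f:
            word = line.split()
            # take the third column of each line
            thirdword = word[2].strip()
            text.append(thirdword)
    result = ''.join(text)
    print(result)
...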
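The answer above prints the joined string and would raise IndexError on a line with fewer than three fields. A slightly more defensive sketch that also writes the result to a second file, as the question asks, might look like this; the function name join_column, its parameters and the UTF-8 encoding are assumptions made for illustration.

```python
def join_column(src_path, dst_path, column=2, sep=""):
    """Join one whitespace-separated column of every line and write it to dst_path."""
    values = []
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            fields = line.split()
            # Skip blank lines and lines too short to contain the column.
            if len(fields) > column:
                values.append(fields[column])
    result = sep.join(values)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(result)
    return result
```

For example, join_column('test.txt', 'out.txt') joins the third column (index 2) of test.txt and stores it in out.txt.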

python3: reading folder names and the names of the files they contain

root@localhost:~/xly/02# cat t.py
import os
print(os.getcwd())
print(os.listdir("."))
root@localhost:~/xly/02# python t.py
/root/xly/02
['flash1', 'normal', 'b', 'ERR_S', 'ERR_B', 'abc.sh', 'test.sh', '1', 't.py', 'Software', 'flash2', 'c', 'ggg', 'a', 'r.py']
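os.listdir returns folders and files mixed together, and only for one level. To get each folder together with the files inside it, os.walk is one option. The sketch below assumes that a nested listing is what the question wants; the helper name list_tree is hypothetical.

```python
import os


def list_tree(top):
    """Return a list of (folder, subfolder_names, file_names) for top and below."""
    entries = []
    for folder, dirnames, filenames in os.walk(top):
        # Sorted copies keep the result stable; os.walk itself is not affected.
        entries.append((folder, sorted(dirnames), sorted(filenames)))
    return entries
```

Printing the result of list_tree(".") shows every folder under the current directory with its own subfolders and files listed separately.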

A question about reading a file with Python and printing it

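You can print it one line at a time:

    for line in file:
        print(line)
        raw_input()  # waits for user input; press Enter to print the next line

raw_input() is the Python 2 name; in Python 3 the same function is called input().

Addendum: Python also lets you pause the program:

    import time
    time.sleep(3)  # pause the program for 3 seconds

If you put this call inside the loop, the program waits 3 seconds after each iteration before starting the next one.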
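The timed variant can be wrapped in a small function. This is only a sketch: the name print_with_pause and the sleep and out parameters are assumptions, added so that the delay and the output target can be replaced when trying it out.

```python
import time


def print_with_pause(lines, delay=3.0, sleep=time.sleep, out=print):
    """Emit each line, pausing for `delay` seconds after every one."""
    count = 0
    for line in lines:
        # Strip the trailing newline so print does not add a blank line.
        out(line.rstrip("\n"))
        sleep(delay)
        count += 1
    return count
```

With an open file object f, print_with_pause(f) prints a line every three seconds; the return value is the number of lines printed.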

Working with Word documents in Python: how to merge cells

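>>> app = my.Office.Word.GetInstance()
>>> doc = app.Documents[0]
>>> table = doc.Tables[1]
>>> table.Cell(1, 1).Select()
>>> app.Selection.MoveDown(Unit=5, Count=2, Extend=1)
>>> app.Selection.Cells.Merge()
>>>

my.Office.Word.GetInstance() uses win32com to obtain an instance of Word's Application object.
The sample Word file I used contains two tables; the second table is the one to be modified.
table.Cell(1, 1).Select() selects the first cell of that sample table.
app.Selection.MoveDown extends the selection downward over the cells to be merged (the answer describes this as selecting 3 more cells).
app.Selection.Cells.Merge() performs the merge. ...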

How is elif read in Python

In this example, I will show how to write a simple MapReduce program for Hadoop in Python.

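Although the Hadoop framework itself is written in Java, programs for Hadoop do not have to be written in Java; they can also be implemented in languages such as C++ or Python.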

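The example program on the official Hadoop website is written in Jython and packaged into a Jar file, which is clearly inconvenient. It does not have to be done that way: we can program against Hadoop with plain Python. Take a look at the example in /src/examples/python/WordCount.py and you will see what I mean.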

What do we want to do?
We will write a simple MapReduce program using CPython, rather than a Jython program packaged into a jar.

Our example mimics WordCount and implements it in Python: it reads text files and counts how often each word occurs.

The result is also written as text, one word per line followed by its number of occurrences, with a tab character separating the two.
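To make the target output concrete, here is a single-process reference that produces the same "word, tab, count" lines without Hadoop. It is only a sketch for checking small inputs against the job's output; the function name word_count_lines is hypothetical, and sorting the words is a choice made here for readability.

```python
from collections import Counter


def word_count_lines(text):
    """Return sorted 'word<TAB>count' lines for a piece of text."""
    counts = Counter(text.split())
    return ["%s\t%d" % (word, counts[word]) for word in sorted(counts)]
```

For instance, print("\n".join(word_count_lines(open("data.txt").read()))) prints the counts for a small local file, where data.txt is a placeholder name.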

Prerequisites
Before writing this program you need a Hadoop cluster up and running, so that you do not get stuck later on.

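If you have not set one up yet, the following short tutorials explain how to do it on Ubuntu Linux (they also apply to other Linux distributions and to Unix):
How to set up a single-node Hadoop cluster on Ubuntu Linux using the Hadoop Distributed File System (HDFS)
How to set up a multi-node Hadoop cluster on Ubuntu Linux using the Hadoop Distributed File System (HDFS)

Python MapReduce code
The trick to writing MapReduce code in Python is to use HadoopStreaming, which passes data between the Map and Reduce steps through STDIN (standard input) and STDOUT (standard output). We simply read input from Python's sys.stdin and write output to sys.stdout; HadoopStreaming takes care of everything else.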

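It really is that simple, believe it!

Map: mapper.py
Save the following code in /home/hadoop/mapper.py. It reads data from STDIN, splits each line into words, and writes one line per word that maps the word to its occurrence count.
Note: make sure the script is executable (chmod +x /home/hadoop/mapper.py).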

#!/usr/bin/env python
import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # increase counters
    for word in words:
        # write the results to STDOUT (standard output);
        # what we output here will be the input for the
        # Reduce step, i.e. the input for reducer.py
        #
        # tab-delimited; the trivial word count is 1
        print('%s\t%s' % (word, 1))

This script does not compute the total number of occurrences of a word. It immediately outputs "<word> 1" for every word it sees, even if the word occurs several times in the input; summing the counts is left to the subsequent Reduce step (or program).
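One way to make the map step easier to try out without a shell pipe is to separate the word-splitting logic from the I/O. The sketch below does that; the names map_lines and main and the injectable stdin and stdout parameters are assumptions for illustration, not part of HadoopStreaming.

```python
import sys


def map_lines(lines):
    """Yield one 'word<TAB>1' record for every word in every input line."""
    for line in lines:
        # str.split() with no argument also drops leading/trailing whitespace.
        for word in line.split():
            yield "%s\t%d" % (word, 1)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Stream records from stdin to stdout, one per line."""
    for record in map_lines(stdin):
        stdout.write(record + "\n")
```

Calling main() at the bottom of the file makes it behave like a streaming mapper, while map_lines can be exercised directly with a list of strings.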

Of course, you are free to change the coding style to suit your own habits.

Reduce: reducer.py
Save the code in /home/hadoop/reducer.py. This script reads the output of mapper.py from STDIN, sums the occurrences of each word, and writes the result to STDOUT.

Again, make sure the script is executable: chmod +x /home/hadoop/reducer.py

#!/usr/bin/env python
from operator import itemgetter
import sys

# maps words to their counts
word2count = {}

# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # parse the input we got from mapper.py
    word, count = line.split('\t', 1)
    # convert count (currently a string) to int
    try:
        count = int(count)
        word2count[word] = word2count.get(word, 0) + count
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        pass

# sort the words lexicographically;
#
# this step is NOT required, we just do it so that our
# final output will look more like the official Hadoop
# word count examples
sorted_word2count = sorted(word2count.items(), key=itemgetter(0))

# write the results to STDOUT (standard output)
for word, count in sorted_word2count:
    print('%s\t%s' % (word, count))

Test your code (cat data | map | sort | reduce)
I recommend testing mapper.py and reducer.py by hand before using them in a MapReduce job; otherwise the job may finish without returning any result. Here are some suggestions for testing the Map and Reduce functionality:

# very basic test
hadoop@ubuntu:~$ echo "foo foo quux labs foo bar quux" | /home/hadoop/mapper.py
foo 1
foo 1
quux 1
labs 1
foo 1
bar 1
quux 1

hadoop@ubuntu:~$ echo "foo foo quux labs foo bar quux" | /home/hadoop/mapper.py | sort | /home/hadoop/reducer.py
bar 1
foo 3
labs 1
quux 2

# using one of the ebooks as example input
# (see below on where to get the ebooks)
hadoop@ubuntu:~$ cat /tmp/gutenberg/20417-8.txt | /home/hadoop/mapper.py
The 1
Project 1
Gutenberg 1
EBook 1
of 1
[...]
(you get the idea)
...
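The cat data | map | sort | reduce pipeline can also be imitated inside a single Python process, which is handy for quick checks. The sketch below swaps in a different reduce technique from the dictionary-based reducer.py above: it uses itertools.groupby, which only works because the records are sorted first, just as the sort step guarantees in the shell pipeline. The names reduce_sorted and run_local are hypothetical, and every record is assumed to be a well-formed "word, tab, count" line.

```python
from itertools import groupby


def reduce_sorted(records):
    """Sum counts of 'word<TAB>count' records that are already sorted by word."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in records)
    # groupby merges only adjacent equal keys, hence the sorted-input requirement.
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield "%s\t%d" % (word, total)


def run_local(lines):
    """Imitate map | sort | reduce on an iterable of text lines."""
    mapped = ["%s\t1" % word for line in lines for word in line.split()]
    return list(reduce_sorted(sorted(mapped)))
```

Because groupby holds only one group at a time, this style of reducer does not need to keep every word in memory, at the cost of depending on sorted input.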

When reposting, please credit the source: 51数据库 » Reading Word with Python