Extracting text from PowerPoint with VC++ @ 10/29/2013

Technical
Environment: Visual Studio 2008, PowerPoint 2010

Steps:
1. After creating the application, open "Add Class..." from the Project menu, select "MFC Class From TypeLib" under MFC, and click Add.

2. In the "Add Class From Typelib Wizard" window:
"Add Class From": select Registry
"Available Type Libraries": select "Microsoft PowerPoint 14.0 Object Library"
"Interfaces": select "_Application, _Presentation, _Slide, Presentations, Shape, Shapes, Slides, TextFrame, TextRange", add them to the list on the right, and confirm
Note: if you change the generated class names, adjust the code below accordingly.

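3. In every generated header file, change no_namespace in the first statement (the #import directive) to auto_rename raw_interfaces_only auto_search rename_namespace("PowerPoint")
Otherwise the build fails with errors such as: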
error C2371: 'FontPtr' : redefinition; different basic types
error C2061: syntax error : identifier 'MsoRGBType'
error C4430: missing type specifier - int assumed. Note: C++ does not support default-int

Code:
Add the following at the start of app::InitInstance():
if (!AfxOleInit()) {
    CString msg("Failed to initialize OLE");
    AfxMessageBox(msg);
    return FALSE;
}

Add the following to the function that does the work:
// Header includes and the choice of input and output file are omitted
CString strText;
// Start PowerPoint
CApplication app;
COleException e;
if(!app.CreateDispatch(L"Powerpoint.Application", &e)) {
    CString strError;
    strError.Format(L"CreateDispatch() failed.\nErr 0x%08lx", e.m_sc);
    AfxMessageBox(strError, MB_SETFOREGROUND);
    return;
}
app.put_Visible((long)TRUE);
// If the PowerPoint window gets in the way, move it off the desktop, for example:
app.put_Top(2000);

// Open the PPT file
CPresentations presSet = app.get_Presentations();
CPresentation pres = presSet.Open(strPptFile, 0, 0, 1);// set strPptFile yourself
// Extract the text on every slide (collection indices are 1-based, hence i + 1 below)
CSlides slideSet(pres.get_Slides());
for(int i = 0; i < slideSet.get_Count(); i ++) {
    CSlide slide = slideSet.Range(COleVariant((long)(i + 1)));
    CShapes shapes(slide.get_Shapes());
    for(int j = 0; j < shapes.get_Count(); j ++) {
        CShape shape(shapes.Item(COleVariant((long)(j + 1))));
        if(shape.get_HasTextFrame() == Office::msoTrue) {
            CTextFrame textFrame = shape.get_TextFrame();
            CTextRange textRange = textFrame.get_TextRange();
            CString txt = textRange.get_Text();
            if(txt.GetLength() > 0) {
                strText += txt;
            }
        }
    }
}
pres.Close();
app.Quit();
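
PowerPoint collections such as Slides and Shapes are indexed from 1, so a 0-based loop counter has to be passed as i + 1. The following is a minimal, portable sketch of that traversal built on mock types only. OneBased, MockSlide, and CollectText are hypothetical names invented for illustration; they are not part of the generated wrapper classes or of the PowerPoint object model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an Office collection such as Slides or Shapes:
// Item() takes a 1-based index, so valid indices run from 1 to Count().
template <typename T>
class OneBased {
public:
    explicit OneBased(std::vector<T> items) : items_(std::move(items)) {}
    long Count() const { return static_cast<long>(items_.size()); }
    const T& Item(long index) const {
        assert(index >= 1 && index <= Count());
        return items_[static_cast<std::size_t>(index - 1)];
    }
private:
    std::vector<T> items_;
};

// A mock slide is reduced to the texts of its shapes; an empty string
// stands for a shape that carries no text.
using MockSlide = OneBased<std::wstring>;

// Visit every shape on every slide and concatenate the non-empty texts.
std::wstring CollectText(const OneBased<MockSlide>& slides) {
    std::wstring all;
    for (long i = 0; i < slides.Count(); ++i) {       // 0-based counter
        const MockSlide& slide = slides.Item(i + 1);  // 1-based lookup
        for (long j = 0; j < slide.Count(); ++j) {
            const std::wstring& txt = slide.Item(j + 1);
            if (!txt.empty()) {
                all += txt;
            }
        }
    }
    return all;
}
```

In this sketch, starting the outer counter at 1 while still passing i + 1 would silently skip the first slide.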

// Write the txt file; set strTxtFile yourself
HANDLE hTxtFile = CreateFile(strTxtFile, GENERIC_WRITE, FILE_SHARE_READ, NULL,
    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if(hTxtFile == INVALID_HANDLE_VALUE) {
    return;
}
DWORD dwWritten = 0;
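// To save as a Unicode (UTF-16LE) text file instead: write the BOM, then the raw characters.
// The byte count is the character count times sizeof(WCHAR).
//WriteFile(hTxtFile, "\xFF\xFE", 2, &dwWritten, NULL);
//WriteFile(hTxtFile, (LPCWSTR)strText, strText.GetLength() * sizeof(WCHAR), &dwWritten, NULL);

// Save as a UTF-8 encoded text file: the first call returns the required size, the second converts
int u8Len = ::WideCharToMultiByte(CP_UTF8, 0, strText, strText.GetLength(), NULL, 0, NULL, NULL);
char* szU8 = new char[u8Len];
::WideCharToMultiByte(CP_UTF8, 0, strText, strText.GetLength(), szU8, u8Len, NULL, NULL);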
WriteFile(hTxtFile, szU8, u8Len, &dwWritten, NULL);
delete[] szU8;

CloseHandle(hTxtFile);
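
WideCharToMultiByte is called twice above: once with a NULL output buffer to obtain the required size, then again to fill the allocated buffer. As a rough illustration of what a UTF-16 to UTF-8 conversion produces, here is a portable sketch of the same two-pass pattern with a hand-written encoder. Utf16ToUtf8 and ToUtf8 are hypothetical helpers, not Windows APIs, and the sketch makes no attempt to validate unpaired surrogates, so its handling of malformed input may differ from the real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// With out == nullptr nothing is written and only the required byte count
// is returned; otherwise the encoded bytes are stored in out.
// Simplification: an unpaired surrogate is encoded as-is, not replaced.
std::size_t Utf16ToUtf8(const std::u16string& in, char* out) {
    std::size_t n = 0;
    auto put = [&](std::uint32_t byte) {
        if (out) out[n] = static_cast<char>(byte);
        ++n;
    };
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        const bool highSurrogate = cp >= 0xD800 && cp <= 0xDBFF;
        if (highSurrogate && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            put(cp);
        } else if (cp < 0x800) {
            put(0xC0 | (cp >> 6));
            put(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            put(0xE0 | (cp >> 12));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        } else {
            put(0xF0 | (cp >> 18));
            put(0x80 | ((cp >> 12) & 0x3F));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        }
    }
    return n;
}

// Two-pass use: size the buffer first, then convert into it.
std::string ToUtf8(const std::u16string& in) {
    std::string bytes(Utf16ToUtf8(in, nullptr), '\0');
    if (!bytes.empty()) {
        const std::size_t written = Utf16ToUtf8(in, &bytes[0]);
        assert(written == bytes.size());
        (void)written;
    }
    return bytes;
}
```

The sizing pass matters because the UTF-8 byte count generally differs from the UTF-16 character count: ASCII takes one byte per character, while most CJK characters take three.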
Posted on 10/29/2013 18:19:01
