How to convert office documents to PDF with powerShell

Preamble

The Microsoft Office applications are automated with COM interop(Microsoft). This is a technology that allows an application to expose its functions to host applications and other components.

When working with COM objects in PowerShell it is important to ensure all used up memory is released after the work is done. Failing to do so might result in memory leaks and unspecific COM error messages and crashes.

In the sample module this is done with the ReleaseComObject function. It accepts COM objects and releases it from memory until no more references are found.

function ReleaseComObject {
    param([__ComObject]$Object)

    if (-not $Object) { return }

    try {
        $referenceCount = 0
        do {
            $referenceCount = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Object)
        }
        while ($referenceCount -gt 0)

        $Object = $null
    }
    finally {
        [GC]::Collect()
        [GC]::WaitForPendingFinalizers()
    }
}

Word

The attached module contains three functions:

Start-Word: It imports the necessary libraries and will start one instance of Word. If an instance of Word has been started already it will return the existing instance.

Close-Word: Closes all documents without saving and the Word application. Cleans up memory.

This must be called when the Word instance isn't needed anymore. Simply killing the process will cause memory leaks. 

Export-DocToPdf: InputObject accepts a valid windows path as string or FileInfo to a Word document and attempts to export it as PDF, using the Word export feature. It returns null in case of failure and a FileInfo object of the pdf in case of success.

#region Word
function Start-Word {
    if ($Global:ApplicationWord) { return $Global:ApplicationWord }

    Add-type -AssemblyName Microsoft.Office.Interop.Word

    $Global:ApplicationWord = New-Object -ComObject Word.Application
    Start-Sleep -Seconds 2
    $Global:ApplicationWord.Visible = $false
    return $Global:ApplicationWord
}
function Close-Word {

    if (-not $Global:ApplicationWord) { return }

    if ($Global:ApplicationWord.Documents) {
        $saveChanges = $false
        foreach ($document in $Global:ApplicationWord.Documents) {
            $document.Close($saveChanges)
            ReleaseComObject -Object $document
        }
    }

    $Global:ApplicationWord.Quit([ref][Microsoft.Office.Interop.Word.WdSaveOptions]::wdDoNotSaveChanges)
    ReleaseComObject -Object $Global:ApplicationWord

    Remove-Variable -Name ApplicationWord -Scope:Global
    Start-Sleep -Seconds 1
}
function Export-DocToPdf {
    param(
        [System.IO.FileInfo]$InputObject,
        [System.IO.FileInfo]$Path
    )

    $ApplicationWord = Start-Word

    $filename = $InputObject.FullName
    $confirmConversions = $false
    $readOnly = $true
    $addToRecentFiles = $false
    $passwordDocument = ""
    $passwordTemplate = ""
    $revert = $true
    $writePasswordDocument = ""
    $writePasswordTemplate = ""
    $format = [Microsoft.Office.Interop.Word.WdOpenFormat]::wdOpenFormatAuto
    $encoding = [Microsoft.Office.Core.MsoEncoding]::msoEncodingUSASCII

    $Document = $ApplicationWord.Documents.Open([ref]$filename,    [ref]$confirmConversions,    [ref]$readOnly,    [ref]$addToRecentFiles,
        [ref]$passwordDocument, [ref]$passwordTemplate, [ref]$revert, [ref]$writePasswordDocument,
        [ref]$writePasswordTemplate, [ref]$format, [ref]$encoding)

    $Document.SaveAs([ref]$Path.FullName, [ref][Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatPDF)

    if ( Test-Path $Path ) { return Get-Item $Path }
    return $null
}
#endregion Word

Excel

The attached module contains three functions:

Start-Excel: It imports the necessary libraries and will start one instance of Excel. If an instance of Excel has been started already it will return the existing instance.

Close-Excel: Closes all workbooks without saving and the Excel application. Cleans up memory.

This must be called when the Excel instance isn't needed anymore. Simply killing the process will cause memory leaks. 

Export-ExcelToPdf: InputObject accepts a valid windows path as string or FileInfo to an Excel document and attempts to export it as PDF, using the Excel export feature. It returns null in case of failure and a FileInfo object of the pdf in case of success.

#region Excel
function Start-Excel {
    if ($Global:ApplicationExcel) { return $Global:ApplicationExcel }

    Add-type -AssemblyName Microsoft.Office.Interop.Excel

    $processExcel = Get-Process -Name EXCEL -ErrorAction SilentlyContinue
    foreach ($process in $processExcel) { $process.Kill() }

    $Global:ApplicationExcel = New-Object -ComObject Excel.Application

    Start-Sleep -Seconds 2
    $Global:ApplicationExcel.Visible = $false
    $Global:ApplicationExcel.DisplayAlerts = $false
    return $Global:ApplicationExcel
}
function Close-Excel {
    if (-not $Global:ApplicationExcel) { return }

    if ($Global:ApplicationExcel.ActiveWorkbook) {
        $saveChanges = $false
        foreach ($sheet in $ApplicationExcel.ActiveWorkbook.Sheets) {
            ReleaseComObject -Object $sheet
        }
        $Global:ApplicationExcel.ActiveWorkbook.Close($saveChanges)
        ReleaseComObject -Object $Global:ApplicationExcel.ActiveWorkbook
    }

    $Global:ApplicationExcel.Workbooks.Close()
    ReleaseComObject -Object $Global:ApplicationExcel.Workbooks

    $Global:ApplicationExcel.Quit()
    ReleaseComObject -Object $Global:ApplicationExcel

    Remove-Variable -Name ApplicationExcel -Scope:Global
    Start-Sleep -Seconds 1
}
function Export-XlsToPdf {
    param(
        [System.IO.FileInfo]$InputObject,
        [System.IO.FileInfo]$Path
    )

    $ApplicationExcel = Start-Excel

    $filename = $InputObject.FullName
    $updateLinks = 3
    $readOnly = $true
    $Workbook = $ApplicationExcel.Workbooks.Open($filename, $updateLinks, $readOnly)

    $quality = [Microsoft.Office.Interop.Excel.XlFixedFormatQuality]::xlQualityStandard
    $includeDocProperties = $true
    $destinationFilename = $Path
    $Workbook.ExportAsFixedFormat([Microsoft.Office.Interop.Excel.XlFixedFormatType]::xlTypePDF, $destinationFilename, $quality, $includeDocProperties)

    if ( Test-Path $Path.FullName ) { return Get-Item $Path.FullName }
    return $null
}
#endregion Excel

 

Powerpoint

In powerPoint the code looks different, some special parameters have to be passed when opening the presentation. "msoFalse" is the equivalent of $false in powerShell. Also in the "SaveAs" method some special parameters are needed to convert the presentation to a PDF file

$powerpnt = New-Object -ComObject PowerPoint.Application
$doc = "C:\Temp\Presentation.pptx"
$saveaspath = "C:\Temp\Presentation.pdf"
$openDoc = $powerpnt.Presentations.Open($doc,[Microsoft.Office.Core.MsoTriState]::msoFalse,[Microsoft.Office.Core.MsoTriState]::msoFalse,[Microsoft.Office.Core.MsoTriState]::msoFalse)
$openDoc.SaveAs($saveaspath , [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF,[Microsoft.Office.Core.MsoTriState]::msoFalse)
$openDoc.Close()
$powerpnt.Quit()
 

Downloads