# extract images from pdf file

import fitz  # PyMuPDF

doc = fitz.open("file.pdf")
for i in range(len(doc)):
    for img in doc.get_page_images(i):
        xref = img[0]                      # xref number of the image
        pix = fitz.Pixmap(doc, xref)
        if pix.n - pix.alpha < 4:          # this is GRAY or RGB
            pix.save("p%s-%s.png" % (i, xref))
        else:                              # CMYK: convert to RGB first
            pix1 = fitz.Pixmap(fitz.csRGB, pix)
            pix1.save("p%s-%s.png" % (i, xref))
            pix1 = None
        pix = None
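
An image that is referenced on several pages shows up once per page in the loop above, so it is written out more than once under different file names. A minimal sketch of one way to skip the repeats, assuming each entry of a page's image list is a tuple whose first item is the xref (as in `img[0]` above). The helper name `unique_xrefs` and the sample tuples are hypothetical, chosen only for illustration:

```python
def unique_xrefs(pages_images):
    """Return (first_page_index, xref) pairs, keeping each xref only once."""
    seen = set()
    result = []
    for page_index, images in enumerate(pages_images):
        for img in images:
            xref = img[0]  # first tuple item is the xref, as in the snippet above
            if xref in seen:
                continue   # already collected from an earlier page
            seen.add(xref)
            result.append((page_index, xref))
    return result


# Stand-in data shaped like per-page image lists (hypothetical values)
sample = [
    [(12, 0, 640, 480), (15, 0, 100, 100)],  # page 0
    [(12, 0, 640, 480)],                     # page 1 reuses xref 12
    [],                                      # page 2 has no images
]
print(unique_xrefs(sample))  # [(0, 12), (0, 15)]
```

With PyMuPDF, the input could be built as `[doc.get_page_images(i) for i in range(len(doc))]`, and each returned pair could then be passed to the extraction step.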

extract images from pdf

# STEP 1
# import libraries
import fitz
import io
from PIL import Image
  
# STEP 2
# file path you want to extract images from
file = "/content/pdf_file.pdf"
  
# open the file
pdf_file = fitz.open(file)
  
# STEP 3
# iterate over PDF pages
for page_index in range(len(pdf_file)):
    
    # get the page itself
    page = pdf_file[page_index]
    image_list = page.get_images()

    # print the number of images found on this page
    if image_list:
        print(f"[+] Found a total of {len(image_list)} images on page {page_index}")
    else:
        print("[!] No images found on page", page_index)
    for image_index, img in enumerate(image_list, start=1):

        # get the XREF of the image
        xref = img[0]

        # extract the image bytes
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]

        # get the image extension
        image_ext = base_image["ext"]

        # load the bytes with PIL and save them (the file name pattern is only an example)
        image = Image.open(io.BytesIO(image_bytes))
        image.save(f"image{page_index + 1}_{image_index}.{image_ext}")