Wednesday, August 29, 2012

Finding prime factors with Pollard's rho and Rabin-Miller algorithms in Python

Like I've said in a previous post, I have a thing for prime numbers.  Factoring falls along that same line.  After I while I thought it would be interesting to write some code to return the prime factors of an input number, mostly because I wanted to see how fast my old machine would able to do it.

I decided on Pollard's rho algorithm because it is easy to code and a whole lot faster than trial division.  This would also need a primality (or compositeness) test, so since I had already coded one for my prime number generator, I stuck with Rabin-Miller.  There are some trial divisions in there, but it's just to catch the small factors.

The hardest part of this, for me, was coming up with a way for Python to handle the list of intermediate factors.  The way it's coded works, but I am sure it could be done better.  That actually applies for the rest of it as well.  An earlier version of this tore though the 8th Fermat number in a reasonable amount of time, so this works for me.

Maybe I'll update this with a different factoring algorithm and significantly cleaner code as a follow-up project.

Here's the code:

import math
import random
import fractions

################################################

def removerepeats(alist):
  
    alist.sort()
  
    for element in alist:
        while alist.count(element)>1:
            alist.remove(element)
  
    return(alist)
  
################################################

def milrabprimality(totest):

    if totest==1:
        return(-1)
  
    smallprimes=[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211]
  
    if totest in smallprimes:
        return(1)
  
    for smtest in smallprimes:
        if totest%smtest==0 and totest!=smtest:
            return(-1)
              
    s=totest-1
    twos=0
    while s%2==0:        #Figures out the seed and powers of two
        s=s//2
        twos+=1
      
    eventest=totest-1    #Make even -> eventest=(2^r)*s
  
    trials=10            #Change if you want to
  
    agen=0              
    a=[]              
    while agen<trials:          
        newa=random.randint(5,25)
        if newa not in a:  
            a.append(newa)      
            agen+=1
  
    trial1count=0
    totprime=0

    while trial1count<trials:        #Miller-Rabin test happens here
        primecount=0
        part1=0
          
        #Part 1
        res1=pow(a[trial1count],s,totest)
        if res1==1:
            primecount=1
            part1=1

        #Part2
        twopow=twos-1
        itercount=0
        if part1==0:                #Only do this loop if part 1 was unsuccessful
            while twopow>=(0):
                res2=pow(a[trial1count],s*(2**twopow),totest)
                twopow-=1
                itercount+=1
                if res2==(totest-1):
                    primecount+=1
                    break          
              
        if primecount!=0:
            totprime+=1
        else:
            return(-1)        #This ends it early since it has to pass for all a
  
        trial1count+=1        #increments the random base
              
    if totprime==trial1count:
        return(1)

################################################  

def pollardwrapper(number):

    factors=[]
    nonprimefactors=[]
    factors.append(number)
  
    if milrabprimality(number)==1:        #Just return it if prime
        return(factors)
  
    for factor in factors:
        if milrabprimality(factor)!=1:
            result1=pollardsrho(factor)
            result2=factor//result1
            factors.append(result1)
            factors.append(result2)
            nonprimefactors.append(factor)
          
    for element in nonprimefactors:
        factors.remove(element)
  
    factors.sort()
    factors=removerepeats(factors)
  
    return(factors)

################################################
def pollardsrho(number):
  
    if number==1:
        return(1)  
    if number==2:
        return(2)
    if number==3:
        return(3)
    if number==4:
        return(2)

    prime=milrabprimality(number)
    if prime==1:
        return(number)
      
    #Algorithm from the Crandall and Pomerance book
    a=random.randint(1,number-3)
    s=random.randint(0,number-1)
  
    u=s
    v=s
    g=1

    while g==1:
        u=((u**2)+a)%number
        v=((v**2)+a)%number
        v=((v**2)+a)%number

        g=fractions.gcd((u-v),number)    #If g!=1 then g is a factor (unless it is equal to number).
      
        if g==number:                    #The bad seed case          
            a=random.randint(1,number-3)
            s=random.randint(0,number-1)
            u=s
            v=s
            g=1          
          
        if g!=1:
            return(g)    #End of algorithm
  
################################################

tofactor=int(input('\nWhat number do you want to factor? '))
pollardfactors=pollardwrapper(tofactor)
print('The prime factors of ',tofactor,'are',pollardfactors)

Monday, August 13, 2012

Autosquiggle

I had little to do today, so I wrote come code to doodle for me:




I'm not sure why you would want to do this, but here's my code in case you want to try it out.  Play with the parameters to make different squiggles.

import matplotlib.pyplot as plt
import numpy as np

#######################################################################
def randangle(prevangle,variance):
    nextangle=np.random.normal(prevangle,variance)
    return(nextangle)
########################################################################
def thetagenerator(count):
    alltheangles=[]
    variance=3.0                #Can change this
    alltheangles.append(10.0)    #Can change this
    angcount=1
   
    while angcount<count:
        newangle=randangle(alltheangles[angcount-1],variance)
        if newangle>360:
            newangle=newangle%360
        alltheangles.append(newangle)
        angcount+=1

    return(alltheangles)
#######################################################################
       
totpoints=5000                    #Can change this
anglecount=0
x=[]
y=[]
x.append(0)        #First point
y.append(0)        #First point
radtodeg=360.0/(2*3.14159265359)
degtorad=1.0/radtodeg

segmentlength=0.01                #Can change this

angles=thetagenerator(totpoints)

points=1    #First point at (0,0)

while points<totpoints:
    currentangle=sum(angles[:points])%360
   
    x.append(x[points-1]+(segmentlength*np.cos(degtorad*currentangle)))
    y.append(y[points-1]+(segmentlength*np.sin(degtorad*currentangle)))
   
    points+=1
   
fig1=plt.figure(figsize=(10,10))    #Can change this
sub1=fig1.add_subplot(111,frameon=False,aspect="equal")
sub1.tick_params(axis='both',which='both',length=0, width=0, labelbottom=False,labeltop=False,labelleft=False,labelright=False)
sub1.plot(x,y)
plt.show()


Wednesday, August 8, 2012

Viewing Kepler Light Curves via Matplotlib and PyFITS

With all the attention that the Curiosity rover is getting, I thought I'd take a look back at the Kepler mission.  The objective of Kepler is, essentially, to discover planets outside our own Solar System.  The Kepler instrument is a space-based, wide-view telescope/photometer (see here for details).

It observes over 100,000 stars in Cygnus and Lyra, recording their brightness over time.  If a star has a planet and its orbit is aligned in the line of sight, the planet will pass across the face of the star, partially blocking its light.  The Kepler instrument is sensitive enough to detect these changes for planets of sufficient size.

The public has access to the data generated by the project.  The "meat" of the data are the light curves.  These are available via a web interface here, but I wanted to see if I could get the same curves from the raw data myself.

The raw data is available in FITS-formatted files.  FITS is a standard format used for image data but can contain non-image data and metadata as well.  For Kepler, the light curves are stored as electron fluxes read from the sensor over each observation.  Higher electron fluxes are brighter observations, and vice versa.

It turns out that there's at least one module that makes dealing with FITS files a lot easier: PyFITS.  Using PyFITS, I was able to put together a simple program to show a light curve stored in a given file.  Once you have this much, it can be a starting point to analyze the data from that source and from other sources.  My code is below: 

import matplotlib.pyplot as plt
import pyfits

filename=raw_input("\nWhat is the FITS file name? ")
FITSfile=pyfits.open(filename)

dataheader=FITSfile[1].header
topheader=FITSfile[0].header
jdadj=str(dataheader["BJDREFI"])
obsobject=str(dataheader["OBJECT"])

lightdata=FITSfile[1].data

flux=lightdata.field("PDCSAP_FLUX")     #Start with the PDCSAP_FLUX if
                                        #you're not sure which one to use.
                                        #This is the corrected electron
                                        #flux (electrons/sec) in the
                                        #optimal aperture after correction.

time=lightdata.field("TIME")            #Barycenter corrected Julian date

fig1=plt.figure()
sub1=fig1.add_subplot(111)
sub1.plot(time,flux,color="black",marker=",",linestyle="None")
xlab="Time (days, Julian date - %s)"%jdadj
sub1.set_xlabel(xlab)
sub1.set_ylabel("Brightness (electron flux)")
plottitle="Light Curve for %s"%obsobject
sub1.set_title(plottitle)

plt.show()

FITSfile.close()



Here is some sample output:

 

This is a light curve for KIC 8191672, a.k.a. Kepler-5.  The planet Kepler-5b orbits this star with a period of about 3.55 days and is roughly twice as massive as Jupiter.  The spots where the reading drops significantly from the average are the times when the planet is transiting across the face of the star, blocking a bit of the star's light.  See here and here for more details on Kepler-5b. 

The raw data by sources can be found here, listed by Kepler ID.