Even and odd function are function that have a symmetrical property. Even functions are symmetrical along the y-axis. Odd functions have rotational symmetry around the origin. These functions have other special properties. Integrating an even function on a symmetric interval (say b) is equal to two time the integral of the function from zero to b, while the integration of an odd function on a symmetric interval equals zero.

{{{id=12| def is_function_even(f): r""" Returns True if the function is even, False if otherwise. INPUT: Symbolic expression OUTPUT: Bool EXAMPLES:: sage: is_function_even(cos(x)) True sage: is_function_even(sin(x)) False sage: f(x) = 0 sage: is_function_even(f) True sage: is_function_even(x^4+6X^3+8) False """ return bool(f(x) == f(-x)) /// }}} {{{id=82| def is_function_odd(f): r""" Returns True if the function is odd, False if otherwise. INPUT: Symbolic expression OUTPUT: Bool EXAMPLES:: sage: is_function_odd(cos(x)) False sage: is_function_odd(sin(x)) True sage: f(x) = 0 sage: is_function_odd(f) True sage: is_function_odd(x^4+6X^3+8) False """ return bool(-f(x)==f(-x)) /// }}}

This function solves the probability that a specified binomial distribution will equal a value or range of values. I design it so that the user can change from a specific value to a range easily.

{{{id=57| def binomial_prob(n,p, value, max = 0): r""" Returns the probability of a binomial distribution of a given value or range of values INPUT: - ``n`` - total number chosen from - ``p`` - probability of success of an independent Bernoulli trail - ``value`` - number of total successes or lower endpoint of the sum if `max` is used - ``max`` - upper endpoint of the sum, default = 0 or not used OUTPUT: floating number of the probability Examples:: sage: binomial_prob(100,.05,5) 0.180017827270428 sage: binomial_prob(100,.05,5,10) 0.552546289246801 sage: binomial_prob(50,.5,10,20) 0.101316568482224 sage: binomial_prob(50,.5,0,50) 1.00000000000000 """ if value>n or max > n: raise ValueError, "Value or lower bound cannot be greater than n" if max >n: raise ValueError, "Upper Bound cannot be greater than n" if p>1: raise ValueError, "Probability parameter, p cannot be greater than 1" total = binomial(n,value)*((p)^value)*((1-p)^(n-value)) # sets initial value if max != 0: # user specified a range of values for k in xrange(value+1,max+1): total += binomial(n,k)*((p)^k)*((1-p)^(n-k)) return total /// }}} {{{id=59| /// }}} {{{id=58| /// }}}

This function returns the area under a normal curve given user specified inputs. There are many uses of this curve. Once use is of estimating the probability of some event.

I tried to make this function versatile in that the user can use this function given varying degrees of input such as simply finding the area under a normal(0,1) curve to a normal with mean x and sd or variance y. The user can also input a list of raw data that will be used to compute a mean and sd.

{{{id=17| def norm_approx(min = -infinity, max= infinity, mean = 0, sd = 1, variance = 1, sample_data = []): r""" Returns the area under the standard normal curve, given user parameters INPUT: - ``min`` - the lower bound default = -oo - ``max`` - the upper bound default = oo - ``mean`` - the mean of the normal distribution default = 0 - ``sd`` - standard deviation, effect the 'width' of the curve default = 1 - ``variance`` - variance, if used this will override 'sd' default = 1 - ``sample_data`` - a list of sample data (numbers), if used overrides mean, sd, and variance OUTPUT: A floating point number equaling the area under specified normal curve. Examples:: sage: norm_approx(0,1) 0.341344746068698 sage: norm_approx(50,oo,40,variance=5) 3.87210821550318e-6 sage: norm_approx(12,15,13,1.5) 0.656296242727180 sage: norm_approx(1,5,sample_data=[3,3,3,3,3,3,3,3,3,3,3,3]) 1.00000000000000 sage: norm_approx(10,15, sample_data=[12,141,131,12,11,10,95,96,87,2,3,4,5]) 0.0301953905456431 """ if (sd or variance)<0: raise ValueError, "Standard deviation or variance cannot be less than 0" if min>max: #in most cases this is an input error print "WARNING: minimum bound is greater than maximum bound" if len(sample_data)!=0: mean = TimeSeries.mean(TimeSeries(sample_data)) #use TimeSeries object because the variable name 'mean' conflicted with the standard mean() function sd = std(sample_data) variance = 1 if variance != 1: sd = sqrt(variance) min = (min-mean)/sd max = (max-mean)/sd if variance ==0 or sd ==0: #the data is constant, has no variation min = -infinity if (min-mean)<=0 else infinity max = infinity if (max-mean)>=0 else -infinity return integral((1/sqrt(pi*2))*e^((-x^2)/2), x, min,max).n() /// }}} {{{id=62| /// }}} {{{id=61| /// }}} {{{id=23| cdata = [12,13,43,6,34,20,33,15,11,10,2,30,76,98,87,56,12,56,87,9,67,55,44] /// }}}


A confidence interval is used to state the following, we are ___ percent confident that the true population mean is between _____ and ____. The percentages used are typically 90, 95 99. But I made it so that the user can specify any confidence level, between 0 to 1. The output can be summarized into one line, that being, sample mean +- Z(ci)*sample sd/sqrt(n). Where Z(ci) is a number associated with the given confidence level, e.g at .95 Z(ci) would be about 1.96. (The exact number for Z is 2/sqrt(2) * inverse_erf(confidence level) ). To find this number a person would usually look at a standard normal table and using the area under the std normal curve take the corrsponding z value. To find this number mathematically involves, an inverse error function. Currently there is no inverse error function native to sage, though it does exist in octave, a package included in sage. This function uses an approximation related to an inverse_erf (called ltqnorm() ) written in python by Peter John Acklam. This inverse is faster than octave function in sage, and another approximation I wrote based on a formula by Sergei Winitzki. All three function are move accurate than most tables look up from a standard normal chart.

Like the norm_approx method, the user can give a list of data instead of mean, and sd.

{{{id=29| def confidence_interval(sample_mean=0,sample_sd=1, sample_variance=1,n_trials=1, ci_level=.95, sample_data=0): r""" Returns a confidence interval of the true mean of the population INPUT: - ``sample_mean`` - the sample mean default = 0 - ``sample_sd`` - the sample standard deviation default = 1 - ``sample_variance`` - the sample variance, if used overrides 'sample_sd' default = 1 - ``n_trials`` - the number of datum, trail, samples, default = 1 - ``ci_level`` - the confidence level, default = 1 - ``sample_data`` - a list of sample data (numbers), if used overrides sample_mean, sample_sd, sample_variance, and n_trials OUTPUT: A list of two number for which at the given confidence level, the true mean is in between. Examples:: sage: cdata = [12,13,43,6,34,20,33,15,11,10,2,30,76,98,87,56,12,56,87,9,67,55,44] sage: confidence_interval(sample_data=cdata,ci_level=.95) [-0.0852158254834867*sqrt(443758) + 876/23, 0.0852158254834867*sqrt(443758) + 876/23] sage: confidence_interval(sample_data=cdata,ci_level=.87) [-0.0658305169001698*sqrt(443758) + 876/23, 0.0658305169001698*sqrt(443758) + 876/23] sage: confidence_interval(43.7,4.2,n_trials=200,ci_level=.90) [-0.345419261278073*sqrt(2) + 43.7000000000000, 0.345419261278073*sqrt(2) + 43.7000000000000] sage: confidence_interval(154.87,7.5,n_trials=1000) [-0.146997298959015*sqrt(10) + 154.870000000000, 0.146997298959015*sqrt(10) + 154.870000000000] """ if (sample_sd or sample_variance)<0: raise ValueError, "Standard deviation or variance cannot be less than 0" if n_trials<=0: raise ValueError, "Number of trials cannot be less than or equal to 0" if ci_level>=1: return [-oo,oo] if sample_data != 0: sample_mean = mean(sample_data) sample_sd = std(sample_data, bias=True) #Very important that bias is true sample_variance = 1 if sample_variance != 1: sample_sd= sqrt(sample_variance) z = ltqnorm((1-ci_level)/2)*sample_sd/sqrt(n_trials) return [sample_mean+z,sample_mean-z] /// }}}

An implementation of a formula by Sergei Winitzki for approximating inverse erf. The 'a' parameter is some approximation constant, other appropriate values would be .140, though.147 give most accurate results.

{{{id=40| def inverse_erf_approx(x, a = .147): #From http://homepages.physik.uni-muenchen.de/~Winitzki/erf-approx.pdf by Sergei Winitzki #Maximum error is .00012, better than most std normal tables return ((-2/(pi*a)) - (ln(1-x^2)/2)+ sqrt( ( ((2/(pi*a)) + ln(1-x^2)/2)^2 )- ((1/a)*ln(1-x^2)) ) )^.5 /// }}} {{{id=45| print (inverse_erf_approx(.95)*2/sqrt(2)).n() timeit("(inverse_erf_approx(.95)*2/sqrt(2)).n()") /// 1.95904893802321 125 loops, best of 3: 2.69 ms per loop }}} {{{id=41| print ltqnorm(.025) timeit("ltqnorm(.025)") /// -1.95996398612019 625 loops, best of 3: 84.8 µs per loop }}} {{{id=50| print float(octave("erfinv(.95)"))*2/sqrt(2).n() timeit('float(octave("erfinv(.95)"))*2/sqrt(2).n()') /// 1.95995857609287 25 loops, best of 3: 13.6 ms per loop }}}

This function uses the area under the lower tail as input and returns back the 'x' location.

{{{id=42| def ltqnorm( p ): """ Modified from the author's original perl code (original comments follow below) by dfield@yahoo-inc.com. May 3, 2004. Lower tail quantile for standard normal distribution function. This function returns an approximation of the inverse cumulative standard normal distribution function. I.e., given P, it returns an approximation to the X satisfying P = Pr{Z <= X} where Z is a random variable from the standard normal distribution. The algorithm uses a minimax approximation by rational functions and the result has a relative error whose absolute value is less than 1.15e-9. Author: Peter John Acklam Time-stamp: 2000-07-19 18:26:14 E-mail: pjacklam@online.no WWW URL: http://home.online.no/~pjacklam """ if p <= 0 or p >= 1: # The original perl code exits here, we'll throw an exception instead raise ValueError( "Argument to ltqnorm %f must be in open interval (0,1)" % p ) # Coefficients in rational approximations. a = (-3.969683028665376e+01, 2.209460984245205e+02, \ -2.759285104469687e+02, 1.383577518672690e+02, \ -3.066479806614716e+01, 2.506628277459239e+00) b = (-5.447609879822406e+01, 1.615858368580409e+02, \ -1.556989798598866e+02, 6.680131188771972e+01, \ -1.328068155288572e+01 ) c = (-7.784894002430293e-03, -3.223964580411365e-01, \ -2.400758277161838e+00, -2.549732539343734e+00, \ 4.374664141464968e+00, 2.938163982698783e+00) d = ( 7.784695709041462e-03, 3.224671290700398e-01, \ 2.445134137142996e+00, 3.754408661907416e+00) # Define break-points. plow = 0.02425 phigh = 1 - plow # Rational approximation for lower region: if p < plow: q = sqrt(-2*math.log(p)) return (((((c[0]*q+c[1])*q+c[2])*q+c[3])*q+c[4])*q+c[5]) / \ ((((d[0]*q+d[1])*q+d[2])*q+d[3])*q+1) # Rational approximation for upper region: if phigh < p: q = sqrt(-2*math.log(1-p)) return -(((((c[0]*q+c[1])*q+c[2])*q+c[3])*q+c[4])*q+c[5]) / \ ((((d[0]*q+d[1])*q+d[2])*q+d[3])*q+1) # Rational approximation for central region: q = p - 0.5 r = q*q return (((((a[0]*r+a[1])*r+a[2])*r+a[3])*r+a[4])*r+a[5])*q / \ (((((b[0]*r+b[1])*r+b[2])*r+b[3])*r+b[4])*r+1) /// }}} {{{id=84| /// }}}