,bin.sed **************************** [ADD algorithm] Sept 9, 2003 **************************** ADD a b Assign A<-a, B<-b, R<-NIL. Assign x<-0|A0|B0, A<-shr A, B<-shr B. x isn't really a number, but a concatenation of the least significant digit from A, the least significant digit from B, and the last carry bit. Lookup x x is now an index into a table: c|A0|B0 c|R0 000 00 001 01 010 01 011 10 100 01 101 10 110 10 111 11 Assign R<-R0|R, x<-c|A0|B0, A<-shr A, B<-shr B. Lookup x and cycle until A and/or B are empty. If the last computation didn't leave a carry, we can just concatenate the remaining bits of A or B with R. One more cycle is necessary to incorporate a carry bit. *********************************************** [AFTER JUMPS] Sept 8, 2003 What to do if the command isn't recognized yet. *********************************************** When the stack frames are gone, we pass through all the jump points without success. We enter this code: s/^@/READCMD,@/ t readcmd s/^\([^,]\+\),.*/ERROR Unrecognized Command: \1,/ t error s/^/EXIT,/ t exit b error This simply calles READCMD when we hit @, causes an ERROR if something is sitting on the stack that looks like a jump address, or else exits. ******************************* [BTOD algorithm] Sept 8, 2003 The binary-to-decimal algorithm ******************************* May 30, 2005: This isn't current. b <- binary number if b=0, return 0 d <- D if negative? b, d <- D- ms <- <1:0:> s1: if b>m ms=(cons (* 10 (car ms)) ms) goto s1 s2: if ms=<0:>, return d q,r=b/m d <- d(lookup-decimal q) b <- r ms <- tail ms goto s2 ************************************** [CROSS PRODUCT algorithm] May 22, 2005 ************************************** (x1 y1 z1) X (x2 y2 z2) = (y1*z2-z1*y2 z1*x2-x1*z2 x1*y2-y1*x2) ******************************* [DIVREM algorithm] Sept 8, 2003 ******************************* DIVREM a b a<-dividend b<-divisor pad b with right zeros to size of a c<-0 od<-copy of b (divisor) s: if b=0, answer=c; remainder=a t<-a-b if t positive?, a<-t; shl c with one else shl c with zero shr b if b is not like od0*, then answer=c; remainder=a goto s ******************************** [FIRST LINE] Sept 8, 2003 Interpretation of the first line ******************************** When sed is invoked, the first line is processed with this: 01 s/^#.*// 02 s/$/,/ 03 G 04 s/\n// 05 /^,/d 06 s/,$/,@/ If this is the first command, the hold space is empty. Otherwise the hold space contains the variables of the previous commands. Line 1 removes comment lines. Any line beginning with a # is a comment. Lines 2-4 combine the pattern and hold spaces: , ,\n , Line 5 checks to see if the input line is empty. If so we clear the pattern space and read another line. Line 6 gives us two options. We can omit writing the @ when we write out program, or we can call READCMD during a program. **************************** [GCD algorithm] May 22, 2005 Greatest Common Divisor **************************** MOD (or REM) is just division, so this gets really slow. There's probably a quick MOD somewhere. gcd A 0 => A gcd A B => gcd B (A mod B) We use GCD to reduce fractions to their simplest form in FSIMPLIFY: gcd = GCD a b a/b => (a/gcd)/(b/gcd) It's happier to think of it as: a*gcd/b*gcd => a/b Since this is so heavy, I only use FNORMALIZE frequently, which keeps the signs in order, and does some quick factors of two: B10100/B1100 => B101/B11 ********************************** [INTERSECT algorithm] May 22, 2005 ********************************** From: Bourke, Paul. http://astronomy.swin.edu.au/~pbourke/geometry/sphereline/ November, 1992 Two points define a line. A point on that line (with unit vector u) is: P = P1 + u(P2-P1) or x = x1 + u(x2-x1) y = y1 + u(y2-y1) z = z1 + u(z2-z1). A point on a sphere of radius r, meanwhile, has an equation of: (x-x3)^2 + (y-y3)^2 + (z-z3)^2 = r^2. The combination of these two equations gives a solution in the form of: A*u^2 + B*u + C = 0, where A = (x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2 B = 2[(x2-x1)(x1-x3) + (y2-y1)(y1-y3) + (z2-z1)(z1-z3)] C = x3^2 + y3^2 + z3^2 + x1^2 + y1^2 + z1^2 - 2[x3*x1+y3*y1+z3*z1] - r^2 One piece of the quadratic is all we need: D=B^2-4AC. If D<0, the sphere and the line do not intersect; D=0 is a tangential hit (at u=-B/(2A)); and when D>0 the line hits twice, once on each side of the sphere. ***************************** [MULT algorithm] Sept 8, 2003 ***************************** MULT and DIVREM algorithms based on class notes from the net. Jerry Breecher, Clark University CS Dept. CSCI 140 and Prof Unknown, `Lecture 11', CIS 290/291 Both gave Dave Patterson at UC Berkeley as their source. MULT a b p<-0 s: if b=0, answer=0 if a0=1, p=p+b shl b shr a goto s ********************************* [RAYTRACE algorithm] May 21, 2005 ********************************* The basic idea of raytracing is that you have an eyepoint from which you are looking (LFP, Look From Point). You are looking towards a particular point (LAP, Look At Point), hopefully at something interesting. Frame what you want to look at inside a grid of points, like looking through a wire mesh, or looking through each of the pixels of your monitor. Each pixel is calculated separately. To determine a pixel's color, calculate a ray from the LFP throught that pixel and see if it hits anything. If it misses, the pixel is the color of the background. If it hits, then return the color of the object. That gives a cartoony, 2D image. You can add depth by calculating how the ray would reflect off the object and see what that would hit, and average its color into the pixel. You can do that repeatedly for a more sophisticated image. I'm not sure how the lights fit in. Do you see if each strike point is in view of a light source and brighten it? We're looking from the LFP at the LAP. Call that vector A (A=LAP-LFP). Au is the unit vector of A (Au=unit(A)). The grid we're looking though can be anywhere as long as it's centered along A. Let's put it at a distance wd from LFP. The point at the center of the grid is the Window Point, WP (WP=LFP+wd*Au). We are given a unit vector for "up" called UP. To determine the unit vector for "right", we take the cross product of A and UP (RT=unit(A X UP)). The grid (window) has a width (WindowWidth) and a height (WindowHeight), and each of those is divided into pixels. There are WindowWidthResolution pixels along the width, and WindowHeightResolution pixels along the height. The distance between pixels along the width is pdw (pdw=WindowWidth/WindowWidthResolution), and pdh along the height (pdh=WindowHeight/WindowHeightResolution). Since we have UP and RT unit vectors along the window, we can specify each pixel relative to the center, the WP. We lengthen the RT and UP unit vectors with pdw and pdh to get a one-pixel step along the width and height (DW=pdw*RT; DH=pdh*UP). WP is in the center of the window. Using a window-specific coordinate system, WP is at (w,h)=(0,0). The upper left pixel is at (-WindowWidthResolution/2,WindowHeightResolution/2). Given w and h, we can calculate the pixel location (Pi) in the world coordinates (Pi=WP+w*DW+h*DH). A = LAP-LFP Au = unit(A) WP = LFP + wd * Au RT = unit(A X UP) pdw = WindowWidth / WindowWidthResolution DW = pdw * RT pdh = WindowHeight / WindowHeightResolution DH = pdh * UP Pi = WP + w*DW + h*DH, w = -WWR/2..(-WWR/2+WWR), h = -WHR/2..(-WHR/2+WHR) ***************************** [SQRT Algorithm] May 22, 2005 ***************************** Babylonian Method (http://en.wikipedia.org/wiki/Square_root) To find the square root (x=sqrt(r)), make a guess at x. Calculate x'=avg(r/x,x). If x==x', or is close enough, we have converged on the answer. ******* Timings ******* Sept 9, 2003 10000 operations, numbers range from +/-(0-100000) ADD 4m44s 284s 35.2 adds/sec SUB 4m38s 278s 36.0 subs/sec MULT 39m52s 2392s 4.2 mults/sec DIVREM 8m03s 483s 20.7 divrems/sec But ADD and SUB call each other when given negative arguments, so it makes sense that they average together. With numbers in the range +(0-100000), ADD 3m46s 226s 44.2 adds/sec SUB 4m59s 299s 33.4 subs/sec New algorithms for ADD and SUB. They should also impact MULT and DIVREM. ADD 2m28s 148s 67.6 adds/sec SUB 2m37s 157s 63.7 subs/sec MULT 25m28s 1528s 6.5 mults/sec DIVREM 4m49s 289s 34.6 divrems/sec May 22, 2005 RT1 4m52s 292s RT2 5m06s 306s with a bunch of prints, two spheres, just a flat one-ray image, 30x18 RT2 is off slightly and needs a tad more fraction usage June 2, 2005 positive numbers (0-100000) on cygwin on Pavilion IADD 1000 17.89s 55.90/s ISUB 1000 19.29s 51.84/s IMULT 100 18.19s 5.50/s IDIV 1000 93.07s 10.74/s FADD 1000 29.87s 33.48/s FSUB 100 5.33s 18.76/s FMULT 100 47.60s 2.10/s FDIV 100 47.41s 2.11/s June 3, 2005 Pavilion = 465MHz Celeron Pavilion, sed in cygwin, WP, PI are floats RT1 14m33s 873s June 4, 2005 Rewrote iadd,isub positive numbers (0-100000) on Pavilion cygwin IADD 10000 51.87s 192.79/s ISUB 10000 69.26s 144.38/s IMULT 1000 80.26s 12.46/s IDIV 1000 62.26s 16.06/s