Grant Olson2023-09-09T00:09:13+00:00http://www.grant-olson.net/Grant Olsonkgo+site@grant-olson.netYou Should Be Using Pogo Pin Receptacles with Pogo Pins2023-09-08T00:00:00+00:00http://www.grant-olson.net/news/2023/09/08/pogo-pin-receptacles<p>There are many articles, videos, blog posts, etc, showing you how to
use pogo pins to make programming and test fixtures for your
electronics projects. But almost all are missing one critical
component to make pogo pins work reliably: the Pogo Pin Receptacle. If
you’re trying to build a fixture or jig that uses pogo pins, you
should be mounting the receptacle part to your fixture, and not
mounting the pogo pin directly.</p>
<p>For the generic pogo pins available on AliExpress, eBay, etc, sold
with designations like P50, P75, and P100, there are corresponding
parts R50, R75, and R100 that you will want to use. Unfortunately
these can be harder to find than the pogo pins, but it’s well worth
tracking the parts down and ordering.</p>
<p>It is a little easier to demonstrate how they work with a live video,
so I made this one to explain how to dramatically improve both the
ease of construction and the reliability of fixtures and jigs that use
pogo pins to make contact with your PCBAs.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/cvsxh8XzDFE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
Elegoo Saturn 2 - Do I remove the "Please press firmly on the build plate when tightening the screws" film?2023-05-25T00:00:00+00:00http://www.grant-olson.net/news/2023/05/25/saturn-2-protective-cover<p><strong>TLDR: Do you remove the film that says “Please press firmly on the build plate when tightening the screws”? YES</strong></p>
<p><img src="/assets/img/elegoo-build-plate.jpeg" width="100%" /></p>
<p>I started up a new firmware job at PomSafe a few months ago. Our
products are small and have some detailing that isn’t suitable for my
normal FDM 3D printers. I decided to finally break down and get a resin
printer, which I had been putting off because I wasn’t sure how messy and
dangerous the resin would be.</p>
<p>I went with an Elegoo Saturn 2. When I first got it the instructions
were good about indicating what plastic needed to be removed on the
vat, but there was a piece of plastic on the build plate that said
“Please press firmly on the build plate when tightening the screws”. I
wasn’t sure if this was supposed to be there or not. I tried picking
at it a little with my finger and thumb to see if it was supposed to
come off. It didn’t budge and it looked like it had a precision fit so
I left it on.</p>
<p>I had some success with printing: the Rook worked, the Cones of
Calibration and Ameralabs test print worked. But I had strange issues
where my custom STLs weren’t quite sticking to the build plate. I was
able to get good prints if I, for example, used a raft and tilted the
parts at 45 degrees, and set the early layer burns to 60 seconds, and
all kinds of tweaks. To complicate things more, I’ve been trying to
print out Orange Pi cases and consumer products, and a lot of the
wisdom out there for these is aimed more at the tabletop miniatures
crowd.</p>
<p>While trying different fixes I noticed that this film was starting
to peel off at the edges. So I got back to wondering if this film was
supposed to come off even though it wasn’t documented anywhere.
Infuriatingly I couldn’t find any pictures of the base of the build
plate on Elegoo’s site to see what the installed plate was supposed to
look like. Searching on the exact term eventually got me to a few
pages on reddit forums where people were talking and saying the film
should come off.</p>
<p>So I’m creating this page to hopefully create something definitive
that will show high up in search results for others who have this
problem. It’s a pretty simple fix, but I thought I should at least
write up a story like those annoying recipes so that the search
engines take it seriously!</p>
<p>Hope this helps another new Saturn 2 owner out there.</p>
Can Crusher - In Depth youtube Video on the Whole Project2022-11-29T00:00:00+00:00http://www.grant-olson.net/news/2022/11/29/can-crusher-6<p>I finally completed version 1 of the can crusher. I decided that
rather than writing another lengthy post, I’d make a video detailing
my thoughts on the entire project.</p>
<p>Enjoy!</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/nmYN3EZNI_g" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
<hr />
<p><em>All cad files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my github
page</a>.</em></p>
Can Crusher Part 5 - Mechanical Updates2022-11-11T00:00:00+00:00http://www.grant-olson.net/news/2022/11/11/can-crusher-5<p>Now that my software is in good shape and I can easily test out the
can crusher, I’m starting to see a lot of problems with the initial
design. This is expected. I just dove in and built the first prototype
without much analysis. It’s time to take another pass on the
mechanical design and get things whipped in to shape.</p>
<p>Let’s take a look one problem at a time:</p>
<h1 id="the-can-will-pop-out-of-the-device">The Can Will Pop Out of the Device</h1>
<p>The first problem was relatively simple. When the crushing element
applied pressure to the can it could come flying out of the front of
the device!</p>
<p>The fix: I made a tray that holds the can.</p>
<p><a href="/assets/img/can-crusher/can-holder.jpg"><img src="/assets/img/can-crusher/can-holder.jpg" width="50%" /></a></p>
<h1 id="acrylic-structural-elements-still-have-too-much-flex">Acrylic Structural Elements Still Have Too Much Flex</h1>
<p>The next problem was that the acrylic still flexed a lot when pressure
was applied to the can to crush it. It was much, much more rigid than
a 3D print would be, but it was flexing enough to cause problems with
the lead screws.</p>
<p>The fix: I got the structural elements made out of 1/4 inch thick
laser cut aircraft aluminum.</p>
<p>I’d never done this before but decided to give OSHCut a try. I’ve
always felt guilty for not using their sister service OSH Park to get
my PCBs made, but the bids always ended up being an order of magnitude
higher than the Chinese PCB vendors. OSHCut worked great! They have a
nice site where you can drop in a .step file and get an instant
quote. This is great if you’re like me and don’t know what the hell
you’re doing. Other companies wanted me to talk to an engineer without
even a ballpark quote!</p>
<p>The price here was a little at the high end for a hobbyist project and
will probably be out of some people’s budgets. For the frame top and
bottom, and crusher element I probably needed a 9 inch by 9 inch piece
of stock. They have a back-of-the-line price that means it takes a few
weeks to get your parts, vs 2 or 3 times as much to start production
the next day. Great for someone on a hobby budget. Still, at $126
for a single run, this will probably be the single most expensive part
of my project. But at this point I’m committed to getting results.</p>
<p>That price would be frustrating if I ordered something and I got the
dimensions wrong and it was unusable and needed to order again. If I
want to do aluminum parts in the future I think I’ll probably still do
a test run on the CNC with cast Acrylic first, and make sure it’s
perfect before risking a bad run on the laser cut aluminum.</p>
<p>Side-by-side: 3D printed version, acrylic version, and final aluminum version. And the installed platform.</p>
<p><a href="/assets/img/can-crusher/mounted-aluminum-crusher-element.jpg"><img src="/assets/img/can-crusher/mounted-aluminum-crusher-element.jpg" width="50%" style="float:right;" /></a>
<a href="/assets/img/can-crusher/print-cnc-laser-cut.jpg"><img src="/assets/img/can-crusher/print-cnc-laser-cut.jpg" width="50%" /></a></p>
<h1 id="the-steppers-dont-create-nearly-enough-crushing-force">The Steppers Don’t Create Nearly Enough Crushing Force</h1>
<p>After getting a solid platform that didn’t flex, the next problem
was that I didn’t have nearly enough crushing force. I expected things
to be difficult, and anticipated having to add something that crushes
the can sidewalls before applying pressure from the top. But even a
can partially crushed by hand wasn’t getting anywhere.</p>
<p>One option to fix this would be to just buy bigger and bigger stepper
motors until we had power. I’m currently using NEMA 17 sized
motors. But that seemed a bit excessive.</p>
<p>After some research I learned about the different threading available
on lead screws.</p>
<blockquote>
<p><strong>The More You Know…™</strong></p>
<p>I was always confused that these lead screws were
referred to as <strong>trapezoidal</strong> when they are clearly <strong>round</strong>.
It turns out this refers to the shape of the thread. Rather than
coming to a sharp point like a normal screw these are flattened
for a good amount of the thread. 1 mm on my 2 mm threads. This
creates a more robust mating with the nut that moves the platform
up and down.</p>
</blockquote>
<p><a href="/assets/img/can-crusher/lead-size-comparison.jpg"><img src="/assets/img/can-crusher/lead-size-comparison.jpg" width="50%" style="float:right;" /></a></p>
<p>My normal 3D printer lead screws had a 2mm pitch, but they had 4
‘starts’. This means there are 4 sets of threads instead of 1 like a
normal screw. The threads are intertwined like the stripes on a
candy cane. (Not sure if that analogy is useful?) So a single rotation
of the stepper motor causes the screws to move the platform up or down
8 mm instead of 2. They do however sell lead screws with 1, 2, or
sometimes even 3 ‘starts’. I was able to buy some screws with a single
start, meaning a rotation now moves the platform 2 mm.</p>
<p>This should give 4 times the force from the same motors, with the
downside being that the crusher element moves at 1/4 the
speed. Installing these finally got me to my first crush of a can that
was even close to acceptable! If I want to get even more power there
are screws with a 1mm lead.</p>
<p>Note the image on the right. These rods have the same thread size, but
the one on the left has a steeper angle. This means it has more starts
and goes further per revolution.</p>
<h1 id="the-crushing-platform-has-many-alignment-problems">The Crushing Platform Has Many Alignment Problems</h1>
<p><a href="/assets/img/can-crusher/rails.jpg"><img src="/assets/img/can-crusher/rails.jpg" width="50%" style="float:right;" /></a></p>
<p>The next problem was that the crushing power would often cause the
screws to come out of alignment, and caused the crusher element to
stop being level. With the 2 mm screws this would become so severe the
system would bind. On the 8 mm screws I could power down the stepper
motors and spin the screws manually to re-level the element. But with
the 2 mm screws I had to actually remove the steppers from their
holders to release enough stress to allow me to align things several
times. That’s obviously not going to work.</p>
<p>One design element on my 3D printers started making a lot more sense.
The higher end ones have straight rods of hardened steel that guide
and align the parts. Bearings reduce friction. The screws then have a
lot more leeway in terms of their positioning. They are moving the
platform up and down, but they are no longer responsible for part
alignment.</p>
<p>I added a straight rod on each side on the back of the structure. Then
I 3D printed out some holders for some vertical bearings. This meant
that I needed a new crusher element that didn’t interfere with the
rods and had mounting holes for the bearing holders.</p>
<p>Which unfortunately meant another order from OSHCut a week after the
first batch came in! After some test 3D prints and acrylic, I was back
to OSHCut for a redux in aircraft aluminum.</p>
<h1 id="the-lead-screws-have-alignment-problems">The Lead Screws Have Alignment Problems</h1>
<p>The last big problem was that it was difficult to get proper alignment
of the lead screws. I imagine this was always a problem, but it was
particularly obvious once the straight rods were helping keep the
platform straight.</p>
<p>As I moved the platform down it would get harder to spin the lead
screws. Just a little bit for most of the way down, but it got
especially hard at the very bottom. This was a big problem because my
stepper drivers use current values to determine if they have stalled
out. The difference was enough that the value changed depending on
whether I was at the bottom of the platform, in the middle or at the
top, making it impossible to do stall detection or auto-home reliably.</p>
<p>I was running in to an issue where I was dealing with tight
tolerances. It seemed like I was less than 0.5 mm from getting the lead screws
straight. I could have kept moving things around just a little,
printing, and testing the range of motion. That would be
time-consuming and wouldn’t really fix the problem. There are lots of
things that introduce tolerance errors:</p>
<ul>
<li>My 3D prints only have 0.4 mm resolution. Parts may be microscopically off
when printed on a different 3D printer.</li>
<li>My professionally laser cut parts seem to have more accurate and slightly
different sizing.</li>
<li>I’m not using precision assembly or measurement. Parts are held together
with T-Slot screws. Having the vertical alignment of the left and right
side stepper holders off by 0.1 mm or less might change the proper location of
the motors.</li>
<li>I could get everything perfect, then drop the thing on a concrete floor,
lose a small amount of alignment, and be back to printing new variations.</li>
</ul>
<p>I needed a solution that accounts for all of these factors by letting me
manually square the device, rather than assuming all my parts are printed
at perfect sizes, assembled the same way, and perfectly aligned.</p>
<p><a href="/assets/img/can-crusher/stepper-holder-with-slots.png"><img src="/assets/img/can-crusher/stepper-holder-with-slots.png" width="50%" style="float:right;" /></a></p>
<p>I changed the mounting holes for the stepper motor into slots and came
up with a procedure to square the machine. I would do an initial
install of the steppers, keep the screws loose, then manually lower
the crusher platform as far as it would go. This would move the
steppers in to proper alignment and I could tighten them. Then I could
do a few manual tests and make sure that I felt the same tension
across the range of motion, and slowly lock the screw down.</p>
<p>That introduced the problem that the rod holders I placed on the top
of the unit had a fixed position that couldn’t be adjusted. And slots
don’t make sense because they may need some play on both the X and Y
axes. Looking at my 3D printer for inspiration I noticed that it
didn’t really even hold the lead screws in place at the
top. The lead screw holders at the top of the lead screws have several
millimeters of clearance and are just there to catch the
screws if something goes really, really, really wrong and the machine
is falling apart. I decided to borrow that approach.</p>
<p>With all these fixes I was able to go back to running my test script
to find appropriate stall current settings. Things were much more
reliable. I produced the same values no matter where I was vertically
on the device. And the results were reproducible. The downside is that
I may need to occasionally re-square the device if it gets a lot of
use.</p>
<h1 id="and-back-to-the-rod-lead-size">And Back to the Rod Lead Size</h1>
<p>Once my new batch of aluminum for the crusher plates arrived I
installed it. I haven’t mentioned this before, but I currently have
two can crushers. The first one lets me do a lot of rough testing and
the second has a more stable configuration. One is the alpha and the
other is the beta version of the product.</p>
<p>I upgraded my alpha crusher first. This one still had the
fast-yet-less-powerful 8 mm lead screws. It turns out that with all
the rigidity in place, and all the changes I made, these were actually
much, much better at crushing cans than they were in earlier
tests. The slower-more-powerful lead screws on the beta crusher had
been extremely slow, maxing out at 8-9 mm per second, and very
difficult to dial in the stall settings.</p>
<p>Based on that I decided to go back to the original 8 mm lead screws
and use those as I move forward on the project.</p>
<h1 id="next-up">Next Up</h1>
<p>I now have a mostly working system but I’m running it via a USB cable
on my computer manually. I originally expected to have a high level
control computer talking to my low level Pico PCB. This would allow
the high-level computer to handle the user interface and more
complicated calculations. I’ve decided it’s time to take a first pass
at that. My goal is to get to the point where I have a self-contained unit
that can run at the press of a button, and not a CLI script. Then I’ll
have my first full version of the entire stand-alone can crusher.</p>
<hr />
<p><em>All cad files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my github
page</a>.</em></p>
Can Crusher Part 4 - Towards Production Software2022-10-31T00:00:00+00:00http://www.grant-olson.net/news/2022/10/31/can-crusher-4<p>At this point I have a working dev board that makes it much easier to
write code than it was on the breadboard. I’m able to deploy code
faster and the can crusher is less precarious, but development is
still slow going. I’m going to start working on higher quality
firmware, focusing on a control interface that allows me to send
commands over UART to the board. This will make it much easier to try
things out as I develop the system.</p>
<p>There’s always a trade off between abstractions and system
control. Python is a great language for development because I can
write code and test very quickly. But it’s not such a good language
for embedded systems because I lose a lot of predictability and speed
that’s required for motor control, sensors, etc. For that you need a
lower level language which slows development time.</p>
<p>The plan is to keep a smaller core of low-level code that’s easy to
follow, and then add an interface so that I can use higher level code
while testing various things out. When all is said and done, then I
either have a pretty good reference implementation of the higher level
activities to port to C, or I can just keep the higher level code
running on another control board that isn’t as sensitive to real-time
requirements.</p>
<p>While I’m working on the software I also want to see if I can take
advantage of the PIO feature on Pico, but I’ll get in to those details later in the post. For now, the…</p>
<h1 id="command-interface">Command Interface</h1>
<h2 id="basic-design">Basic Design</h2>
<ul>
<li>
<p>Send commands over UART.</p>
</li>
<li>
<p>Should be extremely simple language so I don’t waste time writing an
elaborate parser in C.</p>
</li>
<li>
<p>Although I will be able to control directly from a terminal, it’s
anticipated I’ll end up using some sort of middle-ware in python
to build and send commands. The language doesn’t need to be pretty.</p>
</li>
<li>
<p>The interface should be able to report success or failure.</p>
</li>
<li>
<p>The interface should be able to return results as data when needed.</p>
</li>
</ul>
<blockquote>
<p><strong>Getting real.</strong> I’m doing this project for fun and to stretch my
skills. I’m pretending to be each and every member of a team
building out a full product.</p>
<p>If this was a real product, there’s a
perfectly good motor control language called <em>Gcode</em> that probably
covers all the features listed above and more. Even better there’s a
great open source implementation called <a href="https://marlinfw.org/">Marlin</a>.
It’s targeted at 3D
printers but has been customized to run a lot of machines. And on
top of that there are many control boards,
<a href="https://www.aliexpress.com/wholesale?SearchText=marlin+controller">some as cheap as $10</a>,
that can run Marlin.</p>
<p>Realistically, I should just be using both that hardware and software
to control my device.
It would dramatically simplify development time. But since this is a
learning experience we’ll pretend none of that exists.</p>
</blockquote>
<h2 id="control-interface">Control Interface</h2>
<p>The goal here is to keep things as simple as possible. We won’t have
anything like variables, conditionals, loops, turing-completeness, etc. We’ll just send a
message and get something back. We will also make it very easy to parse.</p>
<p>At the core the format will be either <code class="language-plaintext highlighter-rouge">COMMAND</code> to do something or
<code class="language-plaintext highlighter-rouge">COMMAND ARG1 ARG2</code> if we want to send data. There may be a return
value in the form of <code class="language-plaintext highlighter-rouge">COMMAND: VALUE</code>. There will always be an
indication of success or failure with either <code class="language-plaintext highlighter-rouge">OK</code> or an error code such as <code class="language-plaintext highlighter-rouge">ERR 132</code>.</p>
<p>An example communication session would be something like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> WAKE
OK
> HOME
OK
> POSITION?
POSITION: 150
OK
> MOVE 50 10
ERR 109
> POSITION?
POSITION: 173.9
OK
> MOVE -125 20
OK
> SLEEP
OK
</code></pre></div></div>
<p>In this session I:</p>
<ul>
<li>
<p>Wake up the device so the motors are powered. Steppers can run hot
and shouldn’t be left on 24/7.</p>
</li>
<li>
<p>Auto-home the device so we know where the crushing platform is.</p>
</li>
<li>
<p>Check the current position, in millimeters, from the base after homing.</p>
</li>
<li>
<p>Move the platform up 50 mm at 10 mm per second, <strong>but</strong> we encounter
an error indicating that the motors have stalled, <strong>meaning</strong> we
tried to go higher than is possible and hit the ceiling.</p>
</li>
<li>
<p>Check to see how far we really moved. 173.9 - 150 means I moved
23.9 mm before failing.</p>
</li>
<li>
<p>Move the platform down 125 mm at 20 mm per second, so I’m
not touching the ceiling.</p>
</li>
<li>
<p>Go back to sleep to save power.</p>
</li>
</ul>
<p>This is extremely simple to parse but I can still do everything I need
to build up a complex system.</p>
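<p>To show how little parsing this takes, here is a minimal sketch of a
host-side reply parser. The function name and the returned tuple are my own
assumptions for illustration, not code from the repository:</p>

```python
def parse_response(lines):
    """Parse the reply lines for a single command.

    Returns (ok, error_code, values), where values maps each
    'NAME: VALUE' line to its value as a string.
    """
    values = {}
    for line in lines:
        line = line.strip()
        if line == "OK":
            return True, None, values
        if line.startswith("ERR "):
            return False, int(line.split()[1]), values
        if ": " in line:
            name, value = line.split(": ", 1)
            values[name] = value
    raise ValueError("reply ended without OK or ERR")


print(parse_response(["POSITION: 173.9", "OK"]))
print(parse_response(["ERR 109"]))
```

<p>Because every reply ends in either <code class="language-plaintext highlighter-rouge">OK</code> or <code class="language-plaintext highlighter-rouge">ERR</code>, the caller always knows when to stop reading.</p>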
<h2 id="property-bag">Property bag</h2>
<p>The interface dramatically improves things. While testing it quickly
becomes apparent that I still need to recompile to change default
values. Now it’s time to throw all of those values in to a property
bag.</p>
<p>I <a href="https://github.com/grant-olson/can-crusher/blob/main/firmware/pico/property.c">create an interface</a> that allows you to get and set all the
properties in a centralized location. It allows you to do this with
either a C <code class="language-plaintext highlighter-rouge">enum</code> or the name of the property. Then I add a quick
interface to the language <code class="language-plaintext highlighter-rouge">PROP= PROPERTY_NAME PROPERTY_VALUE</code> to set
and <code class="language-plaintext highlighter-rouge">PROP? PROPERTY_NAME</code> to get.</p>
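<p>The real implementation is the C file linked above. Purely as an
illustration of the idea, a property bag that accepts either an enum or a
name might look something like this in Python. The property names, the reply
format, and the error code are all placeholders I made up:</p>

```python
from enum import Enum


class Prop(Enum):
    # Hypothetical property names, for illustration only.
    STALL_THRESHOLD = 0
    MAX_SPEED = 1


class PropertyBag:
    def __init__(self):
        self._values = {prop: 0 for prop in Prop}

    def _resolve(self, key):
        # Accept either the enum member or its name as a string.
        return key if isinstance(key, Prop) else Prop[key]

    def get(self, key):
        return self._values[self._resolve(key)]

    def set(self, key, value):
        self._values[self._resolve(key)] = value

    def handle(self, line):
        """Handle 'PROP= NAME VALUE' or 'PROP? NAME' and return the reply."""
        parts = line.split()
        try:
            if len(parts) == 3 and parts[0] == "PROP=":
                self.set(parts[1], int(parts[2]))
                return "OK"
            if len(parts) == 2 and parts[0] == "PROP?":
                return "%s: %d\nOK" % (parts[1], self.get(parts[1]))
        except (KeyError, ValueError):
            pass  # Unknown property name or non-numeric value.
        return "ERR 1"  # Placeholder error code.


bag = PropertyBag()
print(bag.handle("PROP= MAX_SPEED 40"))
print(bag.handle("PROP? MAX_SPEED"))
```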
<p>I add this and move all the appropriate properties to this system and
things are looking good. Saving this to the Pico’s flash so it
survives reboots should be simple. I just need to:</p>
<ul>
<li>Choose a region of flash.</li>
<li>Add some magic value to the beginning so I know if it’s been initialized yet.</li>
<li>Add a version number so the code can be smart when we add more values.</li>
<li>Save the raw values.</li>
</ul>
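<p>The steps above amount to a small record layout. Here is a sketch of that
layout in Python, assuming every property is stored as a 32-bit unsigned
integer. The magic value and field order are my own placeholders, not what
the firmware actually uses:</p>

```python
import struct

MAGIC = 0xCAC0FFEE  # Hypothetical marker; any unlikely value works.
VERSION = 1
HEADER = struct.Struct("<II")  # magic, version (little-endian uint32)
PAGE_SIZE = 256  # Matches the Pico SDK's FLASH_PAGE_SIZE.


def pack_props(values):
    """Serialize uint32 property values into one 0xFF-padded flash page."""
    body = struct.pack("<%dI" % len(values), *values)
    record = HEADER.pack(MAGIC, VERSION) + body
    if len(record) > PAGE_SIZE:
        raise ValueError("too many properties for one page")
    return record.ljust(PAGE_SIZE, b"\xff")


def unpack_props(page, count):
    """Return the stored values, or None if the region isn't usable."""
    magic, version = HEADER.unpack_from(page, 0)
    if magic != MAGIC:
        return None  # Erased or never written: fall back to defaults.
    if version != VERSION:
        return None  # A real implementation could migrate old layouts here.
    return list(struct.unpack_from("<%dI" % count, page, HEADER.size))


page = pack_props([10, 200, 3])
print(unpack_props(page, 3))
print(unpack_props(b"\xff" * PAGE_SIZE, 3))
```

<p>An erased page reads back as all <code class="language-plaintext highlighter-rouge">0xFF</code>, which can never match the magic value, so a fresh board falls back to defaults.</p>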
<p>The Pico as usual has a great API and documentation on doing this but
I ran in to a few problems.</p>
<h3 id="memory-offsets">Memory Offsets</h3>
<p>First I didn’t realize that to read data you are supposed to read from
an absolute address, and to write data you are supposed to write to a
relative address on the external flash storage:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="cp">#define PROP_OFFSET ((1024 * 1024 * 2) - FLASH_SECTOR_SIZE)
#define PROP_ADDRESS ( XIP_BASE + PROP_OFFSET)
</span>
<span class="c1">// Read access - USES ABSOLUTE ADDRESS</span>
<span class="k">const</span> <span class="kt">uint8_t</span><span class="o">*</span> <span class="n">flash_bytes</span> <span class="o">=</span> <span class="p">(</span><span class="k">const</span> <span class="kt">uint8_t</span> <span class="o">*</span><span class="p">)</span> <span class="n">PROP_ADDRESS</span><span class="p">;</span>
<span class="c1">// Write access - USES RELATIVE OFFSET</span>
<span class="kt">uint8_t</span> <span class="n">data</span><span class="p">[</span><span class="n">FLASH_PAGE_SIZE</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="mh">0xFF</span><span class="p">};</span>
<span class="n">flash_range_program</span><span class="p">(</span><span class="n">PROP_OFFSET</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">FLASH_PAGE_SIZE</span><span class="p">);</span>
</code></pre></div></div>
<p>Note that in the last line you pass <code class="language-plaintext highlighter-rouge">PROP_OFFSET</code>, not
<code class="language-plaintext highlighter-rouge">PROP_ADDRESS</code>, to write to memory. I spent way too much time figuring
this out. It makes sense though. A CPU will map various chunks of
memory, be they ROM, RAM, etc, to certain offsets. But the chips that
actually hold values are only internally aware of their local
addresses. Internally everything will map starting at address 0, and
externally the CPU will decide to route (for example) all addresses
from <code class="language-plaintext highlighter-rouge">0x02000000</code> - <code class="language-plaintext highlighter-rouge">0x02010000</code> to that memory bank. Trying to write
to the absolute address caused my program to crash hard, essentially
segfaulting, which made debugging difficult.</p>
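<p>For a concrete feel of the two numbers involved, here is the arithmetic
worked out in Python, assuming a standard Pico with 2 MB of flash. The
constants mirror the values the Pico SDK defines for the RP2040:</p>

```python
XIP_BASE = 0x10000000          # Where the RP2040 maps external flash for reads.
FLASH_SIZE = 2 * 1024 * 1024   # 2 MB of flash on a standard Pico.
FLASH_SECTOR_SIZE = 4096       # Smallest erasable unit.

# Relative offset: what the erase and program calls expect.
PROP_OFFSET = FLASH_SIZE - FLASH_SECTOR_SIZE
# Absolute address: what a read through a pointer expects.
PROP_ADDRESS = XIP_BASE + PROP_OFFSET

print(hex(PROP_OFFSET))   # 0x1ff000
print(hex(PROP_ADDRESS))  # 0x101ff000
```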
<h3 id="flash-programming-is-blow-fuse-style">Flash programming is <strong>blow-fuse</strong> style</h3>
<p>In general the Pi Pico has great documentation and example code. The
best around. I would have saved myself a lot of time on the previous
problem if I would have run the example code earlier. But here I ran
in to a problem that was a little too subtle to show up in the docs I
read.</p>
<p>If you’ve ever worked on a device with OTP (One-Time Programmable)
Memory, you know that the name isn’t entirely accurate. All the bytes
are filled with 0xFF and to change any bit to zero you <strong>blow the
fuse</strong>. Once that’s done you can never set the fuse back to the 1
state; you’re stuck at 0 for eternity. You can take advantage of this
feature to do some neat tricks. For example you can reserve a bank of
memory for later use as long as you don’t write any values
initially. It will just stay full of bytes of <code class="language-plaintext highlighter-rouge">0xFF</code>. Then you can
have some marker in the main program that checks to see if that memory
has been burned later. If it’s <code class="language-plaintext highlighter-rouge">0xFF</code>, run the normal program. If you
write an upgrade and blow the first bit so that it reads <code class="language-plaintext highlighter-rouge">0xFE</code>, then
run the program starting at memory address <code class="language-plaintext highlighter-rouge">0x2000</code>. Blow the next bit
so that it reads <code class="language-plaintext highlighter-rouge">0xFC</code>, then run the program at <code class="language-plaintext highlighter-rouge">0x4000</code>, etc. Neat
trick. The only problem is you can’t roll back to running the code at
<code class="language-plaintext highlighter-rouge">0x2000</code> if there’s a problem with the new code.</p>
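<p>The marker trick can be sketched as a small lookup. This is only a model
of the idea, with a function name of my own and the
<code class="language-plaintext highlighter-rouge">0x2000</code> slot size taken from the example above:</p>

```python
def boot_address(marker):
    """Map a one-byte OTP marker to the address of the program to run.

    Each upgrade blows the next lowest bit, so the number of zero bits
    says which slot holds the newest code. Returns None when no upgrade
    has been written and the original program should run.
    """
    blown = 8 - bin(marker & 0xFF).count("1")
    # Valid markers look like 0xFF, 0xFE, 0xFC, ...: zeros only at the bottom.
    if marker != (0xFF << blown) & 0xFF:
        raise ValueError("unexpected marker 0x%02X" % marker)
    return None if blown == 0 else blown * 0x2000


print(boot_address(0xFF))       # None
print(hex(boot_address(0xFE)))  # 0x2000
print(hex(boot_address(0xFC)))  # 0x4000
```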
<p>Although not OTP, the flash memory on the Pico works the same
way. This means you <strong>must</strong> run an erase operation before
reprogramming a chunk of memory. If you don’t you end up munging
numbers together. Most obviously, if you previously wrote <code class="language-plaintext highlighter-rouge">0x0</code> you
will not be able to write any other value to that memory address until it
is erased. More
confusing if you don’t understand fuse blowing: If you have written
<code class="language-plaintext highlighter-rouge">0xAA</code> (binary <code class="language-plaintext highlighter-rouge">10101010</code>) and then try to write <code class="language-plaintext highlighter-rouge">0xF0</code>, you’ll end up
with contents of <code class="language-plaintext highlighter-rouge">0xA0</code> as the various fuses are blown.</p>
<p>Moral of the story, always erase memory before re-programming on a Pi Pico.</p>
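<p>One way to convince yourself of this behavior is to model a flash byte
in Python. Programming can only clear bits, so it acts like a bitwise AND
with whatever is already there. This is a simulation for illustration, not
Pico code:</p>

```python
ERASED = 0xFF


def program_byte(current, new):
    """Programming can only clear bits, so the result is a bitwise AND."""
    return current & new


def erase_byte(_current):
    """Erasing resets every bit to 1. (Real flash erases whole sectors.)"""
    return ERASED


cell = program_byte(ERASED, 0xAA)
print(hex(cell))  # 0xaa

cell = program_byte(cell, 0xF0)
print(hex(cell))  # 0xa0, not 0xf0

cell = program_byte(erase_byte(cell), 0xF0)
print(hex(cell))  # 0xf0
```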
<h2 id="python-testing-framework">Python testing framework</h2>
<p>Now my control interface is getting sophisticated and I can do a
lot. But I’m still annoyingly typing commands in to a terminal session
with only basic capabilities: it doesn’t like special keys, I can’t
up-arrow to run the last command, etc. This still isn’t quite where I
want to be to develop quickly. I need a higher level interface.</p>
<p>I’m able to <a href="https://github.com/grant-olson/can-crusher/blob/main/firmware/scripts/serial_cli.py">whip one up in python</a>. I can take advantage of some of
python’s advanced hooks so I don’t need to update my code every time I add
a command, and get to the point where I can run the can crusher
through some sophisticated programs.</p>
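<p>The linked script is the real thing. As a rough idea of the kind of hook
I mean, here is a minimal sketch that uses
<code class="language-plaintext highlighter-rouge">__getattr__</code> to turn any method call into a text command. The class name and the
<code class="language-plaintext highlighter-rouge">send</code> callable are assumptions for this example, and a list stands in for the serial port:</p>

```python
class CommandProxy:
    """Turns any method call into a text command, e.g. move(50, 10)."""

    def __init__(self, send):
        # send is a hypothetical callable that writes one line to the
        # board and returns its reply.
        self._send = send

    def __getattr__(self, name):
        # Only called for attributes that don't otherwise exist, so every
        # unknown method name becomes a command of the same name.
        if name.startswith("_"):
            raise AttributeError(name)

        def command(*args):
            words = [name.upper()] + [str(arg) for arg in args]
            return self._send(" ".join(words))

        return command


sent = []  # Stand-in for the serial port.
cli = CommandProxy(lambda line: sent.append(line) or "OK")
cli.wake()
cli.move(-125, 20)
print(sent)  # ['WAKE', 'MOVE -125 20']
```

<p>With this approach a new firmware command needs no change on the Python side.</p>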
<p>Now I can easily test how fast I can safely go up and down:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kn">from</span> <span class="nn">serial_cli</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">cli</span> <span class="o">=</span> <span class="n">SerialCLI</span><span class="p">(</span><span class="s">"/dev/ttyUSB0"</span><span class="p">)</span>
<span class="n">cli</span><span class="p">.</span><span class="n">home</span><span class="p">()</span>
<span class="n">start_position</span> <span class="o">=</span> <span class="n">cli</span><span class="p">.</span><span class="n">position</span><span class="p">()</span>
<span class="c1"># See where our speed maxes out. incrementing speed
# by 10 mm per second each run.
</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span><span class="mi">100</span><span class="p">,</span><span class="mi">10</span><span class="p">):</span>
  <span class="k">try</span><span class="p">:</span>
    <span class="c1"># Jog up and down
</span>    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="mi">50</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="o">-</span><span class="mi">50</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
  <span class="k">except</span> <span class="n">SerialException</span> <span class="k">as</span> <span class="n">ex</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Failed at speed %d with error %s"</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">ex</span><span class="p">))</span>
    <span class="n">current_position</span> <span class="o">=</span> <span class="n">cli</span><span class="p">.</span><span class="n">position</span><span class="p">()</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Moving to start position at safe speed."</span><span class="p">)</span>
    <span class="n">mm_to_start</span> <span class="o">=</span> <span class="n">start_position</span> <span class="o">-</span> <span class="n">current_position</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="n">mm_to_start</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="real-world-example---better-tuning-of-crash-detection">Real World Example - Better tuning of crash detection</h2>
<p>Let’s return to a problem that was difficult when I was exercising the
stepper motor drivers a few weeks ago. I chose these stepper drivers
because they have stall-detection. This allows me to detect when the
crusher platform hits either the top or bottom of the structure, as
well as when it hits a can. This is done by setting a value for the
current drawn per step. If our current draw drops below that value
then the system decides the motor can’t advance.</p>
<p>Unfortunately the actual value is a bit of a magic number. It’s a
current draw value, but because these drivers can be used with a
variety of motors with different specifications that are used for a
variety of purposes, there is no one-size-fits-all way to determine
what the value should be. The datasheet isn’t able to include any
formulas, for example <em>to have torque threshold of X, use formula
Y</em>. The correct values must be obtained experimentally and tuned for
your particular application.</p>
<p>Early on when I was writing unabstracted C without a control language
this was extremely time consuming and frustrating. I would need to:</p>
<ul>
<li>Set a test value.</li>
<li>Recompile.</li>
<li>Reset the Pico and deploy code.</li>
<li>Have some sort of test action that hits the limit of motion.</li>
<li>Hope the motors don’t keep running forever when they encounter resistance.</li>
</ul>
<p>I did manage to find a value for movement that worked but surely
wasn’t ideal. I also ran in to problems because the value seems to
change as we change the speed, so really I will need to come up with
some sort of function to calculate something along the lines of <em>at
speed X mm per second, use value Y</em>. Additionally, if this was a real
product, we would probably want some sort of field calibration in case
the unit gets knocked around or performance changes with age.</p>
<p>I was able to use python to write a much better system to tune the
numbers than throwing virtual darts in code. It works by:</p>
<ul>
<li>Setting the motors to a given speed.</li>
<li>Maxing out the threshold so the motor will stall.</li>
<li>Trying to move the platform up and down 10 mm.</li>
<li>Lowering the threshold value until the platform can complete the range of motion.</li>
</ul>
<p>This is much, much better, but it’s still slow. It can run up to 256
times while narrowing in on the value. So I added code to find
approximate ranges and narrow in on them: first in steps of 64, then
16, then 1, to speed things up. For example:</p>
<ul>
<li>Test 255, stall. Test 191, stall. Test 127, PASS.</li>
<li>Test 191, stall. Test 175, stall. Test 159, PASS.</li>
<li>Test 175, stall. Test 174, stall. Test 173, stall. Test 172, PASS.</li>
</ul>
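<p>The coarse-to-fine idea can be sketched on its own, without any hardware attached. This is a minimal simulation, not the script used on the crusher: <code class="language-plaintext highlighter-rouge">stalls</code> is a hypothetical callback standing in for a real test move, and it assumes stalling is monotonic (everything above the answer stalls, everything at or below it passes).</p>

```python
def coarse_to_fine(stalls, steps=(64, 16, 1), top=255):
    """Return (threshold, trials): the highest value found that does not stall.

    `stalls(value)` is a hypothetical callback that returns True when a
    test move stalls at that threshold.
    """
    bad, good, trials = top, 0, 0
    for step in steps:
        found = False
        # Walk down from the last known-bad value towards the known-good one.
        for value in range(bad, good - 1, -step):
            trials += 1
            if not stalls(value):
                # The value tested just before this one is the new "bad",
                # clamped so we never search above the top of the range.
                bad, good, found = min(value + step, top), value, True
                break
        if not found:
            raise RuntimeError("no passing value in range")
    return good, trials


if __name__ == "__main__":
    threshold, trials = coarse_to_fine(lambda value: value > 172)
    print("threshold %d found in %d trials" % (threshold, trials))
```

<p>With a simulated boundary at 172 this takes 10 test moves, where stepping down one value at a time from 255 would take 84.</p>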
<p>I can run the test 5 times, come up with an average, and add in a
little bit of a leeway (say 5%) and use that as the value. On top of
it I can run the code over a series of speeds quickly, say from 5 mm
per second to 30, in intervals of 5 mm per second.</p>
<p>Now I <em>could</em> write all of this in C, but it would be time
consuming. In fact I might really want it in C later for field
calibration. But at this point I’m still not sure that this is the
ideal algorithm and what other problems I’ll encounter. It’s really
nice to test quickly, lock down the procedure, then either leave as is
or re-write in C.</p>
<p>I’m able to write a quick python script to do this in less than 15
minutes and less than 100 lines of code:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python
</span>
<span class="kn">from</span> <span class="nn">serial_cli</span> <span class="kn">import</span> <span class="o">*</span>
<span class="kn">import</span> <span class="nn">statistics</span>
<span class="kn">from</span> <span class="nn">time</span> <span class="kn">import</span> <span class="n">sleep</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="n">cli</span> <span class="o">=</span> <span class="n">SerialCLI</span><span class="p">(</span><span class="s">"/dev/ttyUSB0"</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">retry_wake</span><span class="p">():</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">cli</span><span class="p">.</span><span class="n">wake</span><span class="p">()</span>
    <span class="k">except</span> <span class="n">SerialException</span><span class="p">:</span>
        <span class="n">cli</span><span class="p">.</span><span class="n">wake</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">narrow_sg_range</span><span class="p">(</span><span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="p">,</span> <span class="n">step</span><span class="p">,</span> <span class="n">speed</span><span class="p">):</span>
    <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">"Trying "</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">step</span><span class="p">):</span>
        <span class="n">cli</span><span class="p">.</span><span class="n">set_prop</span><span class="p">(</span><span class="s">"STALLGUARD_THRESHOLD"</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
        <span class="n">cli</span><span class="p">.</span><span class="n">sleep</span><span class="p">()</span>
        <span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">"%i. "</span> <span class="o">%</span> <span class="n">i</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">flush</span><span class="p">()</span>
        <span class="n">retry_wake</span><span class="p">()</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
            <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
            <span class="k">print</span><span class="p">()</span>
            <span class="k">return</span> <span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="n">step</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
        <span class="k">except</span> <span class="n">SerialException</span> <span class="k">as</span> <span class="n">ex</span><span class="p">:</span>
            <span class="k">pass</span> <span class="c1"># Later check for stall
</span>    <span class="k">print</span><span class="p">()</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Failed to find inflection point!"</span><span class="p">)</span>
    <span class="k">return</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">find_range_once</span><span class="p">(</span><span class="n">speed</span><span class="p">):</span>
    <span class="n">bad</span><span class="p">,</span> <span class="n">good</span> <span class="o">=</span> <span class="mi">255</span> <span class="p">,</span> <span class="mi">0</span>
    <span class="n">bad</span><span class="p">,</span> <span class="n">good</span> <span class="o">=</span> <span class="n">narrow_sg_range</span><span class="p">(</span><span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="p">,</span> <span class="o">-</span><span class="mi">64</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">bad</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span> <span class="k">return</span> <span class="mi">0</span>
    <span class="n">bad</span><span class="p">,</span> <span class="n">good</span> <span class="o">=</span> <span class="n">narrow_sg_range</span><span class="p">(</span><span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="p">,</span> <span class="o">-</span><span class="mi">16</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">bad</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span> <span class="k">return</span> <span class="mi">0</span>
    <span class="c1"># bad, good = narrow_sg_range(bad, good,-8, speed)
    # if bad == 0: return 0
</span>    <span class="n">last_good</span> <span class="o">=</span> <span class="n">good</span>
    <span class="n">bad</span><span class="p">,</span> <span class="n">good</span> <span class="o">=</span> <span class="n">narrow_sg_range</span><span class="p">(</span><span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
    <span class="c1"># If we didn't get it on 1 we might be on the very edge of stall
</span>    <span class="c1"># detection. Try again along that range.
</span>
    <span class="k">if</span> <span class="n">bad</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">bad</span><span class="p">,</span> <span class="n">good</span> <span class="o">=</span> <span class="n">narrow_sg_range</span><span class="p">(</span><span class="n">last_good</span><span class="o">+</span><span class="mi">3</span><span class="p">,</span> <span class="n">last_good</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">speed</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"SPEED: %d BAD %d, GOOD %d"</span> <span class="o">%</span> <span class="p">(</span><span class="n">speed</span><span class="p">,</span> <span class="n">bad</span><span class="p">,</span> <span class="n">good</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">good</span>
<span class="k">def</span> <span class="nf">find_range</span><span class="p">(</span><span class="n">speed</span><span class="p">):</span>
    <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">find_range_once</span><span class="p">(</span><span class="n">speed</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">5</span><span class="p">)]</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"RAW RESULTS: %s"</span> <span class="o">%</span> <span class="nb">repr</span><span class="p">(</span><span class="n">results</span><span class="p">))</span>
    <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">results</span> <span class="k">if</span> <span class="n">x</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">]</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">results</span><span class="p">)</span> <span class="o"><</span> <span class="mi">3</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nb">RuntimeError</span><span class="p">(</span><span class="s">"BAD DATA POINTS!"</span><span class="p">)</span>
    <span class="n">average</span> <span class="o">=</span> <span class="n">statistics</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">results</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"AVERAGE: %f"</span> <span class="o">%</span> <span class="n">average</span><span class="p">)</span>
    <span class="n">safe_average</span> <span class="o">=</span> <span class="n">average</span> <span class="o">*</span> <span class="mf">0.95</span> <span class="c1"># give an extra 5%
</span>    <span class="n">safe_average</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">safe_average</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"FINAL: %d"</span> <span class="o">%</span> <span class="n">safe_average</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">safe_average</span>
<span class="n">values</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># 5 - 84
# 7 - 109
# 10 - 132
# 12 - 144
# 15 - 157
# 17 - 158
# 20 - 171
# 22 - 151
# 25 - 135 ?
# 30 - ???
</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span><span class="mi">36</span><span class="p">,</span><span class="mi">5</span><span class="p">):</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">set_prop</span><span class="p">(</span><span class="s">"STALLGUARD_THRESHOLD"</span><span class="p">,</span> <span class="mi">171</span><span class="p">)</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">sleep</span><span class="p">()</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">wake</span><span class="p">()</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span><span class="mi">20</span><span class="p">)</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span><span class="mi">20</span><span class="p">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">find_range</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
    <span class="n">values</span><span class="p">.</span><span class="n">append</span><span class="p">(</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">res</span><span class="p">)</span> <span class="p">)</span>
    <span class="n">ten_speed</span> <span class="o">=</span> <span class="mi">131</span> <span class="c1">#values[0][1]
</span>    <span class="n">cli</span><span class="p">.</span><span class="n">set_prop</span><span class="p">(</span><span class="s">"STALLGUARD_THRESHOLD"</span><span class="p">,</span> <span class="n">ten_speed</span><span class="p">)</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">sleep</span><span class="p">()</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">wake</span><span class="p">()</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">home</span><span class="p">()</span>
    <span class="n">sleep</span><span class="p">(</span><span class="mf">1.0</span><span class="p">)</span>
    <span class="n">cli</span><span class="p">.</span><span class="n">move</span><span class="p">(</span><span class="mi">50</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="nb">repr</span><span class="p">(</span><span class="n">values</span><span class="p">))</span>
</code></pre></div></div>
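<p>The measurements noted in the comments of that script are a starting point for the <em>at speed X, use value Y</em> function mentioned earlier. As a rough sketch, and only one of several ways to do it, a lookup table with linear interpolation between measured speeds might look like this. The table reuses the commented values up to 22 mm per second; the uncertain 25 and 30 entries are left out, and the function name is made up for illustration.</p>

```python
from bisect import bisect_right

# (speed in mm/s, measured threshold), copied from the script's comments.
MEASURED = [(5, 84), (7, 109), (10, 132), (12, 144),
            (15, 157), (17, 158), (20, 171), (22, 151)]


def threshold_for_speed(speed, table=MEASURED):
    """Linearly interpolate a threshold for a speed inside the measured range."""
    speeds = [s for s, _ in table]
    if not speeds[0] <= speed <= speeds[-1]:
        raise ValueError("speed %r is outside the measured range" % speed)
    hi = bisect_right(speeds, speed)
    if hi == len(speeds):
        # Exactly the fastest measured speed: nothing to interpolate towards.
        return table[-1][1]
    (s0, t0), (s1, t1) = table[hi - 1], table[hi]
    return round(t0 + (t1 - t0) * (speed - s0) / (s1 - s0))


if __name__ == "__main__":
    print(threshold_for_speed(11))
```

<p>Refusing to extrapolate is deliberate here: the measured values stop rising above 20 mm per second, so guessing beyond the table seems unsafe.</p>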
<h2 id="uncovered-problems">Uncovered problems</h2>
<p>Now that my testing is much more systematic and less ad-hoc I identify
a few problems:</p>
<ol>
<li>
<p>The stall value seems to change as I get lower and lower on the
platform. This makes me think that the threaded rods aren’t
properly aligned and are at slight angles that I can’t see. I’ll
need to investigate and redesign the holders.</p>
</li>
<li>
<p>We can’t travel nearly as fast as I expect. I suspect it’s because
at higher speeds we need to accelerate, and my current algorithm is either
on-at-full-speed or off-at-zero. The TMC2209 datasheet does indeed
indicate that to move swiftly you need some acceleration algorithm,
and this is up to you to write.</p>
</li>
</ol>
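<p>One simple form an acceleration algorithm could take is a linear ramp: start at a speed known to work and step up towards the target. The sketch below only generates the list of speeds. The starting speed and step size are placeholder values, not measured ones, and how each speed gets applied to the motors is left open.</p>

```python
def ramp_speeds(target, start=5.0, step=1.0):
    """Yield speeds in mm/s rising from `start` to `target` in `step` increments.

    `start` and `step` are illustrative defaults, not tuned values.
    """
    speed = start
    while speed < target:
        yield speed
        speed += step
    # Always finish on the requested speed, even if it is below `start`.
    yield target


if __name__ == "__main__":
    print(list(ramp_speeds(10.0)))
```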
<h1 id="pio-stepper-control">PIO Stepper Control</h1>
<p>Since I’m in software mode and still have a few days set aside in my
make-believe sprint, I move on to another feature I wanted to get
working on the Pico. One of the major reasons I wanted to use a Pi
Pico was to get an opportunity to play with the Programmable I/O
(PIO).</p>
<h2 id="general-high-level-pio-justifications">General High Level PIO Justifications</h2>
<p>The RP2040 has two PIO blocks, each containing four small state
machines. These are dedicated sub-processors that are optimized for
dealing with input and output. They have a very small footprint,
memory, and set of assembly instructions, and are very
specialized. But the advantage is that they run completely
independently of the main CPU, and each instruction takes exactly 1
clock cycle, so the execution time is extremely fast and predictable.</p>
<p>That’s the high level explanation that is given by the Pi
Foundation. After working through the datasheet explanation and SDK
examples, it becomes apparent that these processors are extremely
optimized to turn bytes in traditional memory in to signals on GPIO
lines, and vice-versa. I think that’s the best way to think about how
to take advantage of them. How do I turn bytes in to signals, and 1 or
2 signal lines in to bytes?</p>
<p>One SDK example is <a href="https://github.com/raspberrypi/pico-examples/tree/master/pio/uart_tx">UART control</a>. I think this is a really good one. If
you’ve played around with GPIO pins you’ve likely played around with
a few standard interfaces like SPI or I2C that are easy to
‘bit-bash’. They don’t have really tight timing requirements and since
you control the clock you can just flip things up and down to make
things happen:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def send_bit(bit):
    bit_pin.set(bit)
    clock_pin.set(1) # force it high
    clock_pin.set(0) # force it low.
</code></pre></div></div>
<p>But UART is actually extremely timing sensitive. There is no clock
line: both sides agree on a baud rate in advance, and the start bit of
each byte only tells the receiver when to begin counting. You need to
sample each following bit at exactly the right moment. Similarly you
need to send data with very exact timing, which is difficult to do
even in a low level language like C.</p>
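<p>To put numbers on that, here is a small sketch that works out when a receiver would sample each data bit, measured from the falling edge of the start bit. It assumes the usual approach of sampling in the middle of each bit; the function name is made up for illustration.</p>

```python
def uart_sample_offsets_us(baud, data_bits=8):
    """Microseconds from the start-bit edge to the middle of each data bit."""
    bit_us = 1_000_000 / baud
    # Skip the start bit (one bit time), then sample mid-bit:
    # 1.5, 2.5, 3.5 ... bit times after the edge.
    return [(1.5 + n) * bit_us for n in range(data_bits)]


if __name__ == "__main__":
    for offset in uart_sample_offsets_us(115200):
        print("%.2f us" % offset)
```

<p>At 115200 baud a bit lasts about 8.68 microseconds, so being late by a few microseconds is enough to read the wrong bit.</p>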
<p>Another SDK example is <a href="https://github.com/raspberrypi/pico-examples/tree/master/pio/ws2812">WS2812 LED light strips</a>. These are connected
in series and you need to send an extremely specific set of highs and
lows to set possibly hundreds of lights to the correct color. The
exact algorithm is:</p>
<ul>
<li>Send high signal followed by low.</li>
<li>To send 0, go high for 0.35 uSec and low for 0.8 uSec.</li>
<li>To send 1, go high for 0.7 uSec and low for 0.6 uSec.</li>
<li>All signals expected to have +/- 150ns accuracy.</li>
<li>Repeat 100s or thousands of times to set whole light strand.</li>
</ul>
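<p>The timings above can be turned into a quick sketch that shows what a byte looks like on the wire. It assumes bits go out most significant bit first and 24 bits per LED, which is how these LEDs are normally driven; the names are illustrative.</p>

```python
# (high_ns, low_ns) for each bit value, from the timings listed above.
BIT_TIMING_NS = {0: (350, 800), 1: (700, 600)}


def ws2812_waveform(byte):
    """Return the (high_ns, low_ns) pairs for one byte, MSB first."""
    return [BIT_TIMING_NS[(byte >> bit) & 1] for bit in range(7, -1, -1)]


def strip_time_us(num_leds, bits_per_led=24):
    """Worst-case data time for a strip, assuming every bit is a 1 (1300 ns)."""
    return num_leds * bits_per_led * 1300 / 1000


if __name__ == "__main__":
    print(ws2812_waveform(0x80))
    print("%.0f us for 100 LEDs" % strip_time_us(100))
```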
<p>We’re certainly not easily bit-bashing that! I’ve actually tried to do
this for a single WS2812 LED on an under-powered 16 MHz processor
using <code class="language-plaintext highlighter-rouge">NOP</code> commands to get the timing exactly right, and it was just
plain impossible to get accurate timing. But since you can set an
independent frequency for your PIO controller, and calculate the exact
time it takes for each instruction to execute, since it’s one clock
cycle, it’s really easy to get that timing dialed in.</p>
<p>But that’s enough with the SDK examples.</p>
<h2 id="my-pio-based-stepper-clock-signal">My PIO based stepper clock signal</h2>
<p>What I want to do is drive a square wave generator to spin the stepper
motors. Depending on both the speed in mm per second I want to go, and
the mm per step, I can calculate an exact clock frequency. I can also
detect a stall right away because the PIO has a <code class="language-plaintext highlighter-rouge">JMP PIN</code> instruction that
branches whenever a designated pin is high.</p>
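<p>The frequency calculation is just a unit conversion. Here is a sketch with placeholder mechanics: an 8 mm lead per revolution, 200 full steps per revolution and 8 microsteps are assumptions for illustration, not the values of this build.</p>

```python
def step_frequency_hz(speed_mm_s, mm_per_rev=8.0, steps_per_rev=200, microsteps=8):
    """Step pulses per second needed to move at `speed_mm_s`.

    The default mechanics are hypothetical; substitute the real lead screw
    and driver settings.
    """
    steps_per_mm = steps_per_rev * microsteps / mm_per_rev
    return speed_mm_s * steps_per_mm


if __name__ == "__main__":
    print(step_frequency_hz(10))
```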
<p>I have 4 registers to work with:</p>
<ul>
<li>OSR - Output shift register - Send data from normal memory to PIO.</li>
<li>ISR - Input shift register - Send data from PIO to normal memory.</li>
<li>X - scratch register</li>
<li>Y - scratch register</li>
</ul>
<p>This isn’t much but it’ll do.</p>
<p>I can send two 32-bit values to the PIO:</p>
<ul>
<li>Number of steps.</li>
<li>Number of clock cycles to wait to achieve the proper frequency.</li>
</ul>
<p>This is a little different than the PIO wants. Remember I said it’s
optimized to turn bytes directly in to GPIO, and GPIO directly in to
bytes. Here I’m sending intermediate values. But I am able to work
within the confines of the minimal provided assembly language to get
what I want.</p>
<p>Things are also a little complicated because I didn’t think to put the
step pins from the left and right motors next to each other. The PIO
can deal with up to 4 pins, but wants them to be sequential. Luckily
the language includes a ‘side pin’ feature for cases like this.</p>
<p>I also have a different pin for stall detection, but the ‘jump pin’ is
also treated as a different bank of pins. It is a problem that I can
only test one pin so I’ll need to fix that in hardware with an OR gate
later, so either the left or right motor stalling will abort the
code. For now I’ll just pick one.</p>
<p>The basic algorithm:</p>
<ul>
<li>C program pushes number of steps and pre-calculated timing.</li>
<li>PIO waits until it receives both.</li>
<li>PIO goes in to a loop making a square wave, waiting pre-calculated
number of instructions after both setting pin High and then low.</li>
<li>PIO does another loop consuming number of steps.</li>
<li>If the stall detection trips we exit both loops.</li>
<li>PIO sends the remaining number of steps (-1 if done, X if stalled)
so the C program knows how far we actually moved, and if we completed
the requested movement safely.</li>
</ul>
<p>Here’s a quick listing of the code. You’ll need some familiarity with
assembler to follow along. Here are some specific PIO instructions to
help you along:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">pull block</code> loads the OSR with data that the main program pushed
to the TX FIFO, waiting until the data arrives.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">out y, 32</code> copies 32 bits from the OSR to the scratch y register.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">pull noblock</code> an important hack. Try to grab data for the OSR, but
if it’s not there use whatever is in the X register. This
effectively allows me to save the X register for reuse later.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">set pins ...</code> update GPIO pins with values, optionally using the
‘side pins’ I needed due to my pin assignment.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">jmp x-- lp1</code> Decrement the register, jump UNLESS register was 0
then fall through.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">jmp pin lp3</code> Jump only if the jump pin is currently high, else
fall through.</p>
</li>
</ul>
<pre><code class="language-asm">;
; Drive stepper motors with PIO so the clocks are consistent and on time.
;
; Push in a number of steps to take, and the number of cycles to burn
; to get the correct frequency, then pull out the number of remaining steps
; so we can see if we stalled.
.program step_both
.side_set 1 opt
.wrap_target
    pull block              ; Get number of steps
    out y, 32
    pull block              ; Get clock cycles to burn to obtain correct frequency
lp0:
    out x, 32               ; save to x
    pull noblock            ; copy x back in to OSR to use each loop
    set pins, 1 side 0x1    ; Clock ON
lp1:
    jmp x-- lp1             ; Delay for (x + 1) cycles, x is a 32 bit number
    out x, 32               ; grab saved copy of burns
    pull noblock            ; copy x back to osr
    set pins, 0 side 0x0    ; Clock OFF
lp2:
    jmp x-- lp2             ; Delay for the same number of cycles again
    jmp pin lp3             ; Abort if we report stall
    jmp y-- lp0             ; count as one full cycle
lp3:
    mov isr, y              ; Move remaining steps in to isr
    push block              ; Send off to main program
.wrap ; Wait for next set of instructions
</code></pre>
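<p>The control flow of that listing can be modelled in a few lines of Python, which helps when reasoning about the value that comes back. This is a rough model only: it ignores timing, and <code class="language-plaintext highlighter-rouge">stall_on_pulse</code> is a made-up test hook standing in for the jump pin going high.</p>

```python
def simulate_step_both(steps, stall_on_pulse=None):
    """Model the loop structure of the listing; return (pulses, remaining).

    `stall_on_pulse` is a hypothetical hook: the 1-based pulse after which
    the jump pin reads high. `remaining` is what `mov isr, y` would send
    back, read as a signed 32-bit value.
    """
    y = steps & 0xFFFFFFFF
    pulses = 0
    while True:
        pulses += 1                    # set pins 1 ... set pins 0
        if pulses == stall_on_pulse:   # jmp pin lp3
            break
        was_nonzero = y != 0           # jmp y-- tests y before decrementing
        y = (y - 1) & 0xFFFFFFFF       # the decrement always happens
        if not was_nonzero:
            break                      # fall through to lp3
    remaining = y - (1 << 32) if y & 0x80000000 else y
    return pulses, remaining


if __name__ == "__main__":
    print(simulate_step_both(3))
    print(simulate_step_both(3, stall_on_pulse=2))
```

<p>In this model a clean run reports -1, and a request for N steps produces N + 1 pulses because <code class="language-plaintext highlighter-rouge">jmp y--</code> tests the register before decrementing it. That is worth confirming on real hardware before relying on exact step counts.</p>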
<p>Then all we need to kick things off in C:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="nf">step_clocks_for_frequency</span><span class="p">(</span><span class="n">uint</span> <span class="n">frequency</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">uint32_t</span> <span class="n">clocks</span> <span class="o">=</span> <span class="p">(</span><span class="n">clock_get_hz</span><span class="p">(</span><span class="n">clk_sys</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">frequency</span><span class="p">))</span> <span class="o">-</span> <span class="mi">11</span><span class="p">;</span> <span class="c1">// 11 to account for control clock cycles</span>
    <span class="k">return</span> <span class="n">clocks</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">step_x_times</span><span class="p">(</span><span class="n">PIO</span> <span class="n">pio</span><span class="p">,</span> <span class="kt">int</span> <span class="n">sm</span><span class="p">,</span> <span class="n">uint</span> <span class="n">steps</span><span class="p">,</span> <span class="n">uint</span> <span class="n">frequency</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">pio</span><span class="o">-></span><span class="n">txf</span><span class="p">[</span><span class="n">sm</span><span class="p">]</span> <span class="o">=</span> <span class="n">steps</span><span class="p">;</span>
    <span class="n">pio</span><span class="o">-></span><span class="n">txf</span><span class="p">[</span><span class="n">sm</span><span class="p">]</span> <span class="o">=</span> <span class="n">step_clocks_for_frequency</span><span class="p">(</span><span class="n">frequency</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
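<p>As a cross-check on the overhead constant, the cycles per step can be counted straight from the PIO listing. This sketch assumes the state machine runs at the system clock with no clock divider and one cycle per instruction; the 125 MHz figure is the Pico's default system clock, not a measured value.</p>

```python
def cycles_per_step(delay):
    """PIO cycles for one full step pulse, counted from the listing."""
    first_half = 3 + (delay + 1)       # out, pull, set, then the lp1 loop
    second_half = 3 + (delay + 1) + 2  # out, pull, set, lp2 loop, jmp pin, jmp y--
    return first_half + second_half


def step_frequency(sys_hz, delay):
    """Resulting step frequency in Hz, assuming no clock divider."""
    return sys_hz / cycles_per_step(delay)


if __name__ == "__main__":
    # Delay the C helper would produce for 1 kHz at 125 MHz: 62500 - 11.
    print("%.3f Hz" % step_frequency(125_000_000, 62489))
```

<p>By this count the fixed overhead is 10 cycles per step, so subtracting 11 from each half period runs slightly fast. At a 1 kHz step rate the difference is roughly 0.01%, which is negligible here.</p>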
<p>And to get the result to determine if we stalled:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int32_t</span> <span class="n">remaining_ticks</span> <span class="o">=</span> <span class="n">pio_sm_get_blocking</span><span class="p">(</span><span class="n">pio</span><span class="p">,</span> <span class="n">sm</span><span class="p">);</span>
</code></pre></div></div>
<p>Now we have an extremely well timed square wave that will look perfect
on an oscilloscope! Better than anything we could just do directly in
C. For now we block waiting for results. In the future we can run some
minimal housekeeping code in the foreground while waiting for results.</p>
<h1 id="next-steps">Next Steps</h1>
<p>Now I’m finally at the point where I can easily explore the can
crusher as a full unit. I can test the entire system quickly and see
what’s going on. As expected there are several problems with the
initial design. There always are. The biggest problems are:</p>
<ul>
<li>
<p>There seem to be alignment problems as the motors stall more easily
as the platform gets closer to the steppers.</p>
</li>
<li>
<p>We’re not getting nearly enough power to crush cans easily.</p>
</li>
<li>
<p>When the system is stressed the motors can get out of sync, and move
each side of the crushing platform far enough apart that things seize
and I can’t manually reset the plate without disassembly.</p>
</li>
</ul>
<p>It seems like all of the problems are in the mechanical design. I’ll
focus on upgrades with that next.</p>
<hr />
<p><em>All CAD files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my github
page</a>.</em></p>
Can Crusher Part 3 - Development PCBA2022-10-18T00:00:00+00:00http://www.grant-olson.net/news/2022/10/18/can-crusher-3<p>At this point I’ve built out the frame for my can crusher, and I wrote
a proof of concept stepper motor control program with a Raspberry Pi
Pico and TMC2209 stepper driver boards. But the board was getting very
delicate and fragile and didn’t have all the features to make it easy
to develop better software and do testing. I decided it was time to do
round one of a proper PCBA to make it easier to move forward.</p>
<p>If I was actually building this for a company, there would be a good
chance the hardware person would pass off the development to a
firmware person, mechanical people would want test units, etc. A PCBA
makes that a lot easier for all parties involved. In this case it’s
just me, but it still makes my life much easier.</p>
<p>I’ll use KiCad for all the PCBA work.</p>
<h1 id="basic-design">Basic Design</h1>
<p>I want a PCBA that could fit either in the base or top of the
machine. That limits the size. In addition to that I want it to:</p>
<ul>
<li>Have a plug for the 12 Volt power supply that works with
a normal power adapter instead of a bench supply.</li>
<li>Have an easy way to reboot. I’m currently unplugging and plugging
the USB cable to reboot and that is annoying.</li>
<li>Optional Pico power supply pins to run without USB cable.</li>
<li>Explicit enable of 12 volt power to the stepper subsystem.</li>
<li>Provide a UART control connection.</li>
<li>Provide a very simple user interface with feedback.</li>
<li>Expose any unused GPIO pins for additional features, such as a
can sensor.</li>
<li>Mostly use through hole parts.</li>
</ul>
<h2 id="soft-reboot">Soft Reboot</h2>
<p>This is easy enough. You just need to pull the <code class="language-plaintext highlighter-rouge">RUN</code> pin to ground. I
added a button that can be pressed to reboot.</p>
<h2 id="12-volt-enable">12 Volt enable</h2>
<p>The stepper controllers and motors draw a lot of current and can
generate a lot of heat. I don’t want them doing that continuously if I
accidentally leave the crusher plugged in while going away for the
weekend.</p>
<p>I’ll build out a MOSFET switch that defaults to keeping the 12 volt
supply OFF. It can only be explicitly enabled when a properly running
program on the Pico does so. Rebooting, deploying bad code, sitting in
bootloader mode, etc, should default to turning the power off.</p>
<h2 id="uart-control">UART Control</h2>
<p>I’m testing a lot of different things at this point. I want to test
the hardware, software, and the actual functionality of crushing a
can. For the latter I thought it would be nice to have a mini language
where I send commands over a UART connection, like <code class="language-plaintext highlighter-rouge">UP 20mm</code> and
<code class="language-plaintext highlighter-rouge">HOME</code>. It can also return status updates like the current position of
the crushing platform, and if the motors have detected stalls. This
will let me test various crushing algorithms without having to
recompile code.</p>
<p>It also provides the opportunity to provide an advanced control
computer later that can use the same control port to interact with
that system. I could have a Single Board Computer with a fancy
display, touch screen, bluetooth access, etc. Separating those
functions out from the Real-Time functions of controlling the motors
is also good system design.</p>
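<p>As a rough illustration of the mini language, the sketch below parses commands of that shape in Python. Only <code class="language-plaintext highlighter-rouge">UP 20mm</code> and <code class="language-plaintext highlighter-rouge">HOME</code> come from the description above. The <code class="language-plaintext highlighter-rouge">DOWN</code> verb, the <code class="language-plaintext highlighter-rouge">Command</code> type, and <code class="language-plaintext highlighter-rouge">parse_command</code> are assumptions made for the example, not the firmware's actual design.</p>

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical grammar: a verb, optionally followed by a distance in mm.
# UP and HOME appear in the post; DOWN is an assumed counterpart to UP.
COMMAND_RE = re.compile(
    r"^(UP|DOWN|HOME)(?:\s+(\d+(?:\.\d+)?)\s*mm)?$", re.IGNORECASE
)


@dataclass
class Command:
    verb: str
    distance_mm: Optional[float] = None


def parse_command(line: str) -> Command:
    """Parse one line of text received over the UART into a Command."""
    match = COMMAND_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized command: {line!r}")
    verb = match.group(1).upper()
    distance = match.group(2)
    if verb == "HOME":
        # HOME is a bare command and does not accept a distance.
        if distance is not None:
            raise ValueError("HOME takes no distance")
        return Command(verb)
    if distance is None:
        raise ValueError(f"{verb} needs a distance, e.g. '{verb} 20mm'")
    return Command(verb, float(distance))
```

<p>Rejecting malformed lines with an error, rather than guessing, seems like the safer default when the other end of the wire is a pair of stepper motors.</p>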
<h2 id="primitive-ui">Primitive UI</h2>
<p>Once again I want help testing things out without having to either:</p>
<ol>
<li>Recompile.</li>
<li>Or read log entries to get feedback.</li>
</ol>
<p>I’ve added a very simple UI consisting of an RGB led and a single
button. I can try to crush a can when the button is pressed. It can
show a green light if things are good and red if the motors have
stalled. Pressing the button again could cause the platform to lift
itself back to the original position.</p>
<p>Nothing too sophisticated, and I might not even use it, but might
as well throw it on since it’s cheap and I have GPIO pins to spare.</p>
<h2 id="expansion-port">Expansion Port</h2>
<p>Here I’ll just add a header with access to every unused GPIO pin,
ground, and 3.3 volt power supply. This will allow me to add one or
two more devices without having to make a new board. I still haven’t
decided how the machine knows a can has been inserted.</p>
<h2 id="through-hole-parts">Through hole parts</h2>
<p>Let’s keep things easy to assemble, and more importantly make it easier
to modify a board if needed. I can handle the bigger SMT parts just
fine on assembly, but often the first draft of a board will have one
or two minor problems. Nice big holes on the board will make it easier
to jury rig fixes.</p>
<h2 id="basic-non-design">Basic non-design</h2>
<p>For now this is simple. In a final product I’ll want:</p>
<ol>
<li>
<p>A buck converter to take 12 Volts to 4-5 volts to power the
Pico directly. But I don’t know what chip I want to use so
I’ll hold off.</p>
</li>
<li>
<p>A reverse polarity protection diode. This protects the board
from a mis-wired power supply.</p>
</li>
<li>
<p>Capacitors, capacitors, capacitors? We should probably have a
decoupling capacitor on the power supply. I’ve noticed 3d printer
boards that take these stepper boards all have pretty big
decoupling capacitors as well. I need to review
datasheets, do math, look at best practices etc, to pick the right
ones.</p>
</li>
<li>
<p>A hard power switch.</p>
</li>
</ol>
<p>I’ll deal with all that later.</p>
<h1 id="from-idea-to-pcb">From Idea to PCB</h1>
<p>Getting a physical PCB made is surprisingly easy and affordable. There
are manufacturers that get as low as $2-3 for multiple copies
of a simple circuit board. The shipping often costs more than the
actual boards if you’re trying to get them quickly. If you’re a
hobbyist and can wait a few weeks, there’s always the slow boat from
China Post.</p>
<p>To get a board I need to:</p>
<ol>
<li>Design a schematic of the circuits I want.</li>
<li>Export the netlist from the schematic in to the PCB editor.</li>
<li>Lay out the parts in the PCB editor.</li>
<li>Export ‘gerber’ files which tell machines how to do the board layout.</li>
<li>Upload the gerber files to a manufacturer’s site, get a quote, pay.</li>
<li>Wait for delivery.</li>
</ol>
<h1 id="the-schematic">The Schematic</h1>
<p>I’ve whipped together some quick schematics before to get some simple
boards built. Now that I’ve been professionally working with hardware
for a bit, I’ve come to appreciate that the schematic isn’t just a
prerequisite for PCB generation. It is the primary source of
documentation of the hardware. If there’s an equivalent to source code
for PCBs, as far as I’m concerned that’s the schematic.</p>
<blockquote>
<p>“Programs must be written for people to read, and only incidentally
for machines to execute.” <strong>Harold Abelson</strong></p>
</blockquote>
<p>And just like software, I’ve seen schematics that the author believed
were <em>self-documenting</em> and <em>intuitive</em> when in actuality it wasn’t
easy to understand the logic, what a sub-component was trying to
accomplish, and why some little tricks were performed.</p>
<p>Although I feel this is a very simple board I went out of my way to
try to focus on making the layout of the schematic clean and neat. I
also used plenty of <em>gulp</em> words to explain functionality, even though
that’s ultimately totally irrelevant to laying out the PCB.</p>
<p>In general things went smoothly with only a few complications.</p>
<h2 id="schematic-complication-1---missing-symbols-and-footprints">Schematic Complication 1 - Missing symbols and footprints.</h2>
<p>KiCad has a very large set of symbols for all sorts of resistors,
capacitors, transistors, and chips. But it of course can’t have every
part in existence. It was missing footprints for the Pi Pico (because
it’s so new) and the BigTreeTech TMC2209 board (since it is a custom
board). I needed to deal with that.</p>
<p>Luckily the Pi Pico is popular enough all the files I needed were
available on github. But it still wasn’t quite perfect so I had to fork the code. My version is <a href="https://github.com/grant-olson/KiCad-RP-Pico.git">here</a>.</p>
<p>There wasn’t even a starter for the BigTreeTech boards. I had
never built out a symbol and footprint from scratch before so I was a
little intimidated. Luckily it was pretty painless to do this.
KiCad was set up to make it very easy to deal with any sort of
chip or board that has a moderately normal configuration, such as the
2.54mm headers that the BigTreeTech board had.</p>
<h2 id="schematic-complication-2---12-volt-enable-circuit">Schematic Complication 2 - 12 Volt Enable Circuit</h2>
<p>I wanted a circuit to explicitly enable the 12 volt power that gets to
the stepper drivers and eventually the motors. As I mentioned above, I
wanted it to be a safe switch, one where undefined or unexpected
conditions caused it to not power the unit.</p>
<p>To complicate matters the stepper driver boards have two sources of
power coming in:</p>
<ul>
<li>The variable higher-voltage high-current motor supply.</li>
<li>VDD for the logic levels.</li>
</ul>
<p>Because of this I didn’t want to put the switch in the normal position
(between the boards and ground) because then there could be times
where the chip was getting 3.3 volt VDD but wasn’t properly
grounded. Instead I put the switch in front of the load (between +12
volts and the boards).</p>
<p>This meant using a P-Channel Mosfet. And since the Pico runs at 3.3
volts, which is a little too low to trigger a power mosfet, I needed
to add another N-Channel mosfet switch. The basic design is:</p>
<ol>
<li>The Pico sets a GPIO pin to desired state.</li>
<li>This activates a BS170 N-Channel mosfet which is happy with 3.3 volts.</li>
<li>This in turn opens up an open drain circuit, dropping the voltage
from 12 volts via a pull-up to ground.</li>
<li>This opens up the P-Channel mosfet and the full current flows
all the way to the stepper motors.</li>
</ol>
<p>To make sure I got things right I did a mini test of just this circuit
on a breadboard with two mosfets, one resistor, and a jumper wire.</p>
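<p>The four steps above boil down to a small piece of boolean logic, which can be sketched in Python as a sanity check of the "defaults to off" behavior. This is a simplified model rather than a circuit simulation, and it assumes that a GPIO which is not actively driven high (rebooting, bootloader mode, bad code) behaves like a low output. The function name is made up for the example.</p>

```python
def stepper_supply_enabled(gpio_driven_high: bool) -> bool:
    """Model the two-stage MOSFET switch as plain boolean logic.

    Assumption: a GPIO that is not actively driven high is treated the
    same as a low output, so the N-channel mosfet stays off.
    """
    # Stage 1: the N-channel mosfet conducts only when its gate is high.
    n_channel_on = gpio_driven_high

    # Stage 2: with the N-channel off, the pull-up holds the P-channel
    # gate at the 12 volt supply. With it on, the gate is pulled to ground.
    p_gate_pulled_to_ground = n_channel_on

    # Stage 3: the P-channel mosfet conducts only when its gate sits well
    # below its source, which is tied to the 12 volt supply.
    return p_gate_pulled_to_ground
```

<p>The only input that turns the supply on is a deliberate high level from running code. Every other state leaves the pull-up in charge and the stepper boards unpowered.</p>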
<h2 id="schematic-results">Schematic Results</h2>
<p>Here’s the final schematic. Keep in mind that once I was in the PCB
layout phase the process of PCB and schematic design was
iterative. I’d realize I needed to change a pin layout and would go
back to the schematic and edit and re-import to the PCB editor. This
schematic isn’t the rough draft, it’s the first draft. For example, I
inverted the <strong>Right Z-Axis Stepper Control</strong> schematic symbol to
accommodate a good PCB layout and to keep my schematic clean.</p>
<p><a href="/assets/img/can-crusher/can-crusher-schematic.pdf"><img src="/assets/img/can-crusher/schematic-rev-1.png" width="100%" /></a></p>
<h1 id="pcb-layout">PCB Layout</h1>
<h2 id="general-pcb-layout-goals">General PCB Layout Goals</h2>
<p>In general there are a few goals any time you lay out a circuit
board. In no particular order:</p>
<p>Minimizing PCB Layers. Traces connecting components can’t touch. They
can’t cross paths. Eventually you get painted in to a corner and can’t
connect two parts because there are traces in the way. The fix for
this is to move the trace to a different PCB layer. The most obvious
example is moving the trace from the top of the board to the
bottom. But eventually that bottom layer might get filled up and you
need to move to a 4 layer PCB design (or more) which increases
production cost and board complexity. Based on the simplicity of my
design I felt confident I could stick with a two layer board.</p>
<p>Organization of electronic components. If a datasheet for a chip
recommends a capacitor on the power line you’ll want that close to the
actual chip, not on the other side. If you have a <em>differential pair</em>
of traces, like the <strong>D+</strong> and <strong>D-</strong> signals of USB, they should be next
to each other. There may also be requirements to make traces between
two components as short as possible. All these things need to be taken
in to consideration when laying out the board.</p>
<p>Organization of other components. The power plug should be on the edge
of the board. If you have a light that shows that the unit is on, or a
USB connector to program, those parts will also need to be on the
edge.</p>
<h2 id="laying-out-the-can-crusher-board">Laying Out the Can Crusher Board</h2>
<p>When you start laying out the PCB it creates a <em>rats-nest</em> of
parts. The program just drops them on the board with little lines
indicating where parts need to be connected, but they’re a mess and
they overlap. It’s up to you to figure out where to go from there.</p>
<p>The first issue that became apparent was that the pins I used to
connect my Pi Pico to the TMC2209 boards on my breadboard were not
ideal for the PCB. There were way too many crossed traces. I suspected
this was going to be a problem on early drafts of the schematic, but
decided not to sort it out until I could visualize the layout. Once I
saw the parts and could position them on the board, it was easy to
reassign the pins on my schematic and get a much cleaner board layout.</p>
<p>The second issue was that I’ve never run power through a PCB
before. KiCad has great defaults for microcontroller projects, but each
stepper motor I have can draw up to 1.5 amps. That’s up to 3 amps
total coming from the power supply. So what’s the problem?</p>
<p>A trace is just a flat wire. Just like wire, the size is only rated to
carry a certain amount of current. If you go over that limit by too
much for too long, the wire will heat up, burn up, or even melt! I
found a reference chart and it recommended 0.76mm traces for 2 amps
and 1.25mm traces for 3 amps. I increased the appropriate wire sizes.
Then I had to reroute because thickening the existing wires caused
them to touch other wires and parts.</p>
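<p>Charts like that are usually derived from a formula. As a hedged cross-check, here is the commonly used IPC-2221 approximation for outer-layer traces in Python. This is not necessarily how the chart I used was generated, and the 10 °C temperature rise and 1 oz copper defaults are assumptions for the example. With those defaults the results land in the same ballpark as the chart values.</p>

```python
def trace_width_mm(current_amps, temp_rise_c=10.0, copper_oz=1.0):
    """Estimate the minimum outer-layer trace width for a given current.

    Uses the IPC-2221 approximation I = k * dT**0.44 * A**0.725, where A
    is the copper cross-section in square mils. The default temperature
    rise and copper weight are assumptions, not values from the post.
    """
    k = 0.048  # IPC-2221 constant for external (outer) layers
    area_sq_mils = (current_amps / (k * temp_rise_c ** 0.44)) ** (1 / 0.725)
    thickness_mils = copper_oz * 1.378  # 1 oz copper is about 1.378 mil thick
    width_mils = area_sq_mils / thickness_mils
    return width_mils * 0.0254  # convert mils to millimeters


for amps in (2.0, 3.0):
    print(f"{amps} A -> {trace_width_mm(amps):.2f} mm")
```

<p>Inner layers shed heat less easily and use a smaller constant, so this estimate only applies to traces on the top or bottom of a two layer board like this one.</p>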
<blockquote>
<p><strong>ProTip™:</strong> In the past I got very annoyed when I needed to
move a trace in the PCB editor. KiCad treats each line segment as
its own trace. I would always have to delete 5 line segments and
would always miss the very small sub-trace that made the final
connection to the component pad.</p>
<p>This popped up again when I was
increasing trace sizes; selecting a trace, right clicking
properties, etc, was annoying. After some google-fu I learned that I
could select a sub-trace and then hit the <strong>u</strong> key a few times to
expand the selection to include the rest of the traces in the
network.</p>
</blockquote>
<p>The last issue was that I wanted to take advantage of the silkscreen
to make the board self documenting. Explaining which pins on the power
header were positive and negative, explaining what GPIO pins we hooked
in to on the expansion header, etc. It turned out KiCad has a pretty
nice solution to this. If you choose <strong>Edit Footprint</strong> on an
individual component in the PCB Editor it modifies that part only. You
don’t need to edit the original footprint, or create a new official
footprint in your library, just to have the right labels on an eight
pin header.</p>
<p>In spite of all the issues it wasn’t too difficult to get things laid
out, pass the DRC checks, and get a final version of revision 1 of the
board. I was proud that I only had one trace that needed to jump from
one side of the board to the other to avoid hitting other traces.</p>
<p><a href="/assets/img/can-crusher/pcb-render.png"><img src="/assets/img/can-crusher/pcb-render.png" width="50%" style="float:right;" /></a></p>
<p><a href="/assets/img/can-crusher/pcb-layout-rev-1.png"><img src="/assets/img/can-crusher/pcb-layout-rev-1.png" width="50%" /></a></p>
<h1 id="ordering-the-pcb">Ordering the PCB</h1>
<p>I ran a simple export of files from KiCad and sent them off to the
vendor Wednesday afternoon EDT, about 3-4 A.M. in China. They
were able to manufacture the boards on Thursday and Friday their time,
get things sent off to DHL, and somehow amazingly I received the
boards on Monday by noon. Less than a week turnaround. Five boards
total. The price was $3 for the boards and $19.05 for the expedited
shipping. It’s a great day and age to be a maker!</p>
<p>The unpopulated boards:</p>
<p><a href="/assets/img/can-crusher/unpopulated-pcb.jpeg"><img src="/assets/img/can-crusher/unpopulated-pcb.jpeg" width="100%" /></a></p>
<h1 id="assembly">Assembly</h1>
<p>Assembly was straightforward. All the components were through
hole. The footprint for the Pi Pico was nice. It was set up to use
either headers, or solder directly to the board. For my test unit I
added headers so I can swap the boards. This also allowed me to test
the 12 Volt Power circuit one last time without inserting either the
Pico or the TMC2209 boards.</p>
<p><a href="/assets/img/can-crusher/populated-pcb.jpeg"><img src="/assets/img/can-crusher/populated-pcb.jpeg" width="50%" /></a></p>
<p>Once testing was done it was easy to plug in the final components and start using the board.</p>
<p><a href="/assets/img/can-crusher/populated-pcb-with-boards.jpeg"><img src="/assets/img/can-crusher/populated-pcb-with-boards.jpeg" width="50%" /></a></p>
<h1 id="next-steps">Next Steps</h1>
<p>The new board is working well and makes it much easier to push new
versions of the firmware. However, it’s still pretty slow when I’m
trying to exercise the range of motion, test sensorless homing, etc.
My hard-coded scripts of action are too basic and then when I hit the
base or the motors lose sync, I need to write a new hard-coded script
to fix it.</p>
<p>To make things easier I’ll focus on adding a mini control language
accessible via the UART port. Then I can test the actual movement
without having to hard code a sequence of events in the code base. This
should make it easier for me to focus on algorithms for running the
motors, find the right sensitivity settings for sensorless stalling,
etc.</p>
<hr />
<p><em>All cad files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my github
page</a>.</em></p>
Can Crusher Part 2 - Stepper Drivers and Controls2022-10-09T00:00:00+00:00http://www.grant-olson.net/news/2022/10/09/can-crusher-2<p>Now that I’ve built a frame for my can crusher, the next big task is
to write out the drivers to control the stepper motors and test things
out. At this point everything will be on a breadboard. The goal is to
make sure I know how the driver boards work, and test out the
motors. It is not the production-level implementation yet.</p>
<h1 id="basic-design">Basic Design</h1>
<ul>
<li>Two independent stepper motors, one on each side of the unit.</li>
<li>On boot the device should home by going down until the base of
the unit is identified by both stepper motors.</li>
<li>From there move up X mm to make space for a can.</li>
<li>When can is detected, lower and crush the can.</li>
<li>There will probably be additional logic when we first touch the
can and feel resistance.</li>
</ul>
<h1 id="motor-controller-boards">Motor Controller boards</h1>
<p>I decided to use some TMC2209 driver boards manufactured by
BigTreeTech. These are built for 3D printers and CNC machines and have
a configuration that allows you to plug them in to the existing
control boards for these machines. I’m going to use these driver
boards but create my main control board from scratch.</p>
<p>The TMC2209 was chosen because it has built in stall-detection, and
I’m hoping to get auto-homing, like you see on higher end 3D printers
like Prusas. I’ll detect the end of the range of motion when stall
detection kicks in.</p>
<p>The cheap solution is to install limit switches that are just some
bits of ribbon wire that get physically pressed to complete a
circuit. These are okay, but I find them annoying because they usually
act as a proxy for the actual limit, aren’t so accurate, and can move
a little bit over time. The stall detection approach is much more
elegant.</p>
<p>I’m also anticipating a stall when the plate initially hits the can so
if that’s the case I’ll also be able to detect when we are held up on
a can and not actually hitting the base of the unit.</p>
<h1 id="initial-verification">Initial Verification</h1>
<p>I wanted to test out the boards before hooking up to my Pi Pico. The
simplest possible working configuration should be:</p>
<ul>
<li>A stepper motor is hooked up.</li>
<li>Power is applied to both the control portion and the motor portion,
probably at different voltages.</li>
<li>A variety of pins need to be set to either GND or VCC to enable
desired behavior.</li>
<li>A steady clock signal should be applied to the STEP pin and we should
see the motor move.</li>
</ul>
<p>This is a good example where accumulating gear over the years helps
out. In my earlier days I would have skipped this step and gone
straight to hooking up to my Pico (or Arduino, etc) and would bang my
head against the wall until things worked. I also would have done
something silly to deal with the fact that the motors have different
power requirements than the logic portion of the board, like
connecting a nine-volt battery. And then when things didn’t work I
would never know if the problem was on the software side or hardware
side. There would be too many variables to make debugging easy.</p>
<p>With a bench power supply and a signal generator I was able to create
an initial test rig that didn’t require any microprocessor or any
coding. I sent 5 volts to the VCC, 9 volts to the VM (Voltage Motor)
pin, and did some quick math to determine that I wanted my signal
generator to send out a square wave at 1600 Hz to rotate the motor at
one Revolution Per Second. (Test motor specs indicate 1.8 degrees
rotation per step, 200 steps for a full circle, and the TMC2209
defaults to 8 sub-steps, so 200 * 8 = 1600)</p>
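<p>The same arithmetic as a tiny Python helper, in case other speeds or microstep settings need checking. The function name and parameters are just for illustration.</p>

```python
def step_frequency_hz(revs_per_second, step_angle_deg=1.8, microsteps=8):
    """Return the STEP pin frequency needed for a given rotation speed."""
    # 1.8 degrees per full step works out to 200 full steps per revolution.
    full_steps_per_rev = round(360 / step_angle_deg)
    return revs_per_second * full_steps_per_rev * microsteps


print(step_frequency_hz(1))  # one revolution per second
```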
<p>Annoyingly the BigTreeTech boards have a pin configuration that won’t
let you plug them directly in to a breadboard without shorting 3
pins. Luckily the pins in question are also accessible from top side
so I was able to cut off two pins on the bottom side:</p>
<p><a href="/assets/img/can-crusher/annoying-pinout.jpeg"><img src="/assets/img/can-crusher/annoying-pinout.jpeg" width="50%" /></a></p>
<p>Even with the simple test setup, I managed to fry a board (two if I’m
honest) before getting the proper configuration. One thing that
complicates these stepper boards is that the motor side of the chip
can want to draw 1 or 2 amps of current, at a higher voltage than the
logic side supports. A problem with the wiring can send way too much
voltage and current to the logic side, frying it. Normally I would
avoid this by keeping the current limits on my bench power supplies
low, but in this case high current is required to spin the motors.</p>
<p>In the end my test setup worked and the rotational speed looked
good to the eye.</p>
<p><a href="/assets/img/can-crusher/initial-tmc-verification.jpeg"><img src="/assets/img/can-crusher/initial-tmc-verification.jpeg" width="100%" /></a></p>
<h1 id="raw-uart-control">Raw UART Control</h1>
<p>The TMC2209 chips have an unusual UART setup for use of more
advanced features. It uses a single wire shared among the TX and RX
lines, and all communication is accomplished by getting or setting
register values. That makes the operation very similar to an I2C
device, but instead of SDA and SCL we have a bidirectional TX/RX pin.</p>
<p>I was proud of myself for doing a good test setup on the basic stepper
control and decided to do something similar with the UART. I set up
the chip on my breadboard and interfaced it with a generic FTDI
UART controller. In this configuration I didn’t even hook up a motor or
the Motor Voltage. I decided there was no point in having all that current
risking damage when I didn’t even have motors plugged in.</p>
<p>That turned out to be a mistake! I wasted a lot of time with the
device infuriatingly not responding at all. Combing over the datasheet
again and again, I decided my problem was the order in which either
the bytes or the bits were set, LSB vs MSB. I even broke out my DSO
Labs logic analyzer and tried to read the signals. And after
exhausting all possibilities I decided to try one last thing, and
added back in the Motor Voltage power from my external power
supply. And things suddenly worked as expected! Reviewing the
datasheet it looks like there is a 5 volt regulator on that side that
powers some of the internals, and the VM power doesn’t simply feed
directly to the motors and nothing else as I thought.</p>
<p>Next up was sorting out how to read and write all the values. I
<strong>did</strong> make a stupid bit order mistake here. Here is the datasheet
entry:</p>
<p><a href="/assets/img/can-crusher/uart-data-structure.png"><img src="/assets/img/can-crusher/uart-data-structure.png" width="100%" /></a></p>
<p>It clearly shows the bit order, but my mind still interpreted the picture
incorrectly. I also think because I kept comparing this UART protocol
to a poor-man’s I2C interface where the read/write bit is indeed the
Least Significant Bit, it added to my confusion. I decided that I
should send the read address with <code class="language-plaintext highlighter-rouge">(addr << 1)</code> and the write address
with <code class="language-plaintext highlighter-rouge">(addr << 1) + 1</code>. Looking at the middle entry this seemed
correct reading left-to-right, but looking at the section listing with
bits, this is <strong>clearly wrong</strong>. The correct read address is just
<code class="language-plaintext highlighter-rouge">addr</code> and the correct write address is <code class="language-plaintext highlighter-rouge">addr + 128</code> to set the HIGH
bit to 1.</p>
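<p>Side by side in Python, the correct encoding and my mistaken I2C-style encoding look like this. The helper names are made up for the example, and this only covers the register address byte, not the rest of the datagram.</p>

```python
WRITE_BIT = 0x80  # the high bit of the register address byte marks a write


def _check(register):
    # Register addresses are 7 bits wide, leaving the high bit free.
    if not 0 <= register <= 0x7F:
        raise ValueError(f"register address out of range: {register:#x}")
    return register


def read_address(register):
    """Correct: a read request sends the register address unchanged."""
    return _check(register)


def write_address(register):
    """Correct: a write request sets the high bit (same as adding 128)."""
    return _check(register) | WRITE_BIT


def i2c_style_address(register, write):
    """Mistaken: shifts the address and puts the flag in the low bit."""
    return (_check(register) << 1) + (1 if write else 0)


print(hex(write_address(0x40)), hex(i2c_style_address(0x40, True)))
```

<p>For every register except <code class="language-plaintext highlighter-rouge">0x00</code> the two encodings produce different bytes, which is why the chip never answered the way I expected.</p>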
<h1 id="stall-detection">Stall Detection</h1>
<p>With the UART enabled I was ready to tackle stall detection. This took
a lot of trial and error to get right. One thing that’s annoying about
the datasheets that come with many chips is that they have hundreds of
pages, and are very exhaustive, but they still don’t tell you how to
do the things you actually want to do.</p>
<p>In this case I want a stall to throw the <code class="language-plaintext highlighter-rouge">DIAG</code> pin high so I can
catch it. I was left with this chart:</p>
<p><a href="/assets/img/can-crusher/stall-guard.png"><img src="/assets/img/can-crusher/stall-guard.png" width="100%" /></a></p>
<p>And it seemed like I needed to just set <code class="language-plaintext highlighter-rouge">0x40</code> to an appropriate value
and it would magically work. I tried high values, I tried low values,
I tried middle values, still nothing. After some googling I found some
working reference code. I learned I also needed to set the first
register in the section (<code class="language-plaintext highlighter-rouge">0x14</code>) to enable all the StallGuard
capabilities. My reading of the section made it sound like you only
needed to set <code class="language-plaintext highlighter-rouge">0x14</code> to <strong>disable</strong> functionality in some cases, but
you also need to set it to <strong>enable</strong> it in other cases.</p>
<p>In any case I was able to get working stall detection with some
randomly picked values for both registers. Once I’m further along I’ll
go back to the datasheet and try to calculate some smarter values for
those registers.</p>
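<p>The lesson, sketched in Python: both registers have to be written before the stall output does anything. The <code class="language-plaintext highlighter-rouge">write_register</code> callback is a hypothetical stand-in for whatever actually sends the UART datagram, the function name is made up, and any threshold values passed in are placeholders to be tuned, not recommendations. The datasheet calls these two registers TCOOLTHRS and SGTHRS.</p>

```python
TCOOLTHRS = 0x14  # must be set for the StallGuard output to be enabled
SGTHRS = 0x40     # the stall sensitivity threshold


def configure_stall_detection(write_register, velocity_threshold, stall_threshold):
    """Write both registers needed for stall detection.

    write_register(register, value) is a hypothetical callback that
    performs the actual UART register write.
    """
    if not 0 <= velocity_threshold <= 0xFFFFF:
        raise ValueError("velocity threshold must fit in 20 bits")
    if not 0 <= stall_threshold <= 0xFF:
        raise ValueError("stall threshold must fit in 8 bits")
    # Setting only SGTHRS is not enough: without TCOOLTHRS the stall
    # signal never shows up.
    write_register(TCOOLTHRS, velocity_threshold)
    write_register(SGTHRS, stall_threshold)
```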
<h1 id="uart-mode---device-id-assignment">UART Mode - Device ID assignment</h1>
<p>The UART mode does support having up to 4 TMC2209 chips on the same
bus. In theory you set each one to a unique device ID that is included
in the register requests. Unfortunately this chip reuses the same two
pins that are used to determine the amount of sub-steps that the
driver provides.</p>
<p>If you want two motors to have the same stepping speed
on the same UART bus, you need to either:</p>
<ol>
<li>
<p>Add in some sort of external switching network to activate and
deactivate connections to the UART pins. Or,</p>
</li>
<li>
<p>Make sure the motors aren’t enabled and getting step signals,
change the pin states on each to give them different addresses,
send the appropriate commands, then restore the old state back to
the desired step level.</p>
</li>
</ol>
<p>I went with option two which meant I needed to use 4 more GPIO pins on
the Pi Pico. That took me to a total of 12 GPIO pins just for these
two chips’ GPIO requirements, and 2 more for the UART connection.</p>
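<p>Option two can be sketched in Python as a "borrow the pins, then put them back" pattern. The <code class="language-plaintext highlighter-rouge">DriverPins</code> class below is a stand-in that only records pin levels, and <code class="language-plaintext highlighter-rouge">temporary_address</code> is a made-up helper. On real hardware the attributes would drive actual GPIO lines.</p>

```python
from contextlib import contextmanager


class DriverPins:
    """Stand-in for the GPIO lines wired to one driver board."""

    def __init__(self, ms_pins):
        self.enabled = False    # True while the motor is enabled and stepping
        self.ms_pins = ms_pins  # levels on the two shared step/address pins


@contextmanager
def temporary_address(driver, address_pins):
    """Use the shared pins as a UART address, then restore the step setting."""
    if driver.enabled:
        raise RuntimeError("disable the motor before changing the shared pins")
    saved = driver.ms_pins
    driver.ms_pins = address_pins
    try:
        yield driver
    finally:
        # Always restore the step setting, even if a UART command failed.
        driver.ms_pins = saved


left = DriverPins(ms_pins=(0, 0))
right = DriverPins(ms_pins=(0, 0))
with temporary_address(left, (0, 0)), temporary_address(right, (1, 0)):
    # Each driver now answers to its own address, so register reads and
    # writes can be sent to one chip at a time here.
    pass
```

<p>Restoring the pins in a <code class="language-plaintext highlighter-rouge">finally</code> block means both motors go back to the same step setting even if something goes wrong mid-conversation.</p>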
<h1 id="pi-pico-software">Pi Pico software</h1>
<p>I started running my tests on a Pi Pico somewhere in the middle of
testing out UART. I just wrote simple test code to exercise all of the
functionality. I was able to hook up two steppers with lead screws and
do some basic exercising of the motors and stall sensing.</p>
<p>Currently the code sets up the stall detection and then provides a
function to move up or down X number of mm at a rate of Y mm per
second. It’s good enough to write proof of concept homing/leveling
code and test movement.</p>
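<p>The conversion behind a move function like that is simple enough to sketch in Python. This is not the project's code: the function name is made up, and the 8 mm lead screw travel per revolution is a placeholder assumption that should be replaced with the real screw's value. The 200 steps and 8 sub-steps match the test motor numbers used earlier.</p>

```python
def plan_move(distance_mm, speed_mm_per_s, screw_lead_mm=8.0,
              full_steps_per_rev=200, microsteps=8):
    """Convert a move in mm into (step count, step rate in Hz, direction).

    screw_lead_mm is how far the platform travels per screw revolution.
    The default is an assumed placeholder, not a measured value.
    """
    if speed_mm_per_s <= 0:
        raise ValueError("speed must be positive")
    steps_per_mm = full_steps_per_rev * microsteps / screw_lead_mm
    total_steps = round(abs(distance_mm) * steps_per_mm)
    step_rate_hz = speed_mm_per_s * steps_per_mm
    direction = 1 if distance_mm >= 0 else -1  # +1 is up, -1 is down
    return total_steps, step_rate_hz, direction
```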
<h1 id="next-steps">Next steps</h1>
<p>I’ve gotten as far as a Pico program that provides enough control to
test. But my test setup is getting really messy.</p>
<p><a href="/assets/img/can-crusher/test-breadboard.jpg"><img src="/assets/img/can-crusher/test-breadboard.jpg" width="100%" /></a></p>
<p>There are many problems with my current breadboard:</p>
<ul>
<li>My breadboard is getting to be a real mess.</li>
<li>Having a bench power supply for the motors is annoying.</li>
<li>I had to hot-glue down the JST-XH plugs that hold the servo connectors
since they only sort of fit in to the breadboard and would pop out.</li>
<li>I needed to plug and unplug the USB on my Pi Pico to reboot the device to
deploy and run code again.</li>
<li>Wires everywhere, afraid I’m going to somehow mess up the setup when I don’t
notice a wire coming loose.</li>
</ul>
<p>This is all getting in the way of working on the software
development. I want to build out a dev PCB to eliminate most of these
problems. This will make things a lot less fragile, and will add some
features to make it easier for me to redeploy and test code quickly.</p>
<hr />
<p><em>All cad files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my github
page</a>.</em></p>
Can Crusher Part 1 - Building the Frame2022-10-03T00:00:00+00:00http://www.grant-olson.net/news/2022/10/03/can-crusher-1<p>I have some free time so I decided to do a project for
fun where the goal was to build something from start-to-finish, doing
as much as possible. All my public projects are software-only. Now
that I’m doing more stuff with robotics, hardware, and consumer
products, I thought it would be nice to do something to showcase all
of those skills.</p>
<p>The project is an aluminum can crusher. Not just any crusher, but the
most technologically advanced robotic can crusher the world has ever
seen!</p>
<p>I hope to:</p>
<ul>
<li>Design all the structural elements myself in a CAD program.</li>
<li>Fabricate it all in-house (literally my house) with 3D printing,
CNC, etc.</li>
<li>Design a custom interface to control stepper motor drivers from a RP2040,
aka Raspberry Pi Pico.</li>
<li>Design a PCB myself to hold the Pi Pico, Stepper Drivers, and
associated electronics.</li>
<li>Get the PCB manufactured and assemble it in-house.</li>
<li>Provide a complete slick industrial-yet-commercial looking product.</li>
</ul>
<p>My biggest concern is wondering if the thing will actually have enough
power to crush cans. But we’ll worry about that later. For now, the
frame…</p>
<h1 id="basic-goals-for-the-frame">Basic Goals for the Frame</h1>
<ul>
<li>
<p>Do a serious CAD project in FreeCAD.</p>
</li>
<li>Don’t just make everything a 3D print:
<ul>
<li>Use Aluminum Extrusions for the frame.</li>
<li>Use CNCed Acrylic for the frame ends for additional strength,
and to get experience with a
home CNC machine I own but haven’t really used yet.</li>
</ul>
</li>
<li>
<p>Use affordable over-the-counter hardware, nothing too extravagant.</p>
</li>
<li>Provide files so someone else can print and build.</li>
</ul>
<h1 id="freecad">FreeCAD</h1>
<p><a href="/assets/img/can-crusher/FreeCadRender.png"><img src="/assets/img/can-crusher/FreeCadRender.png" width="100%" /></a></p>
<p>In the past I’ve used OpenSCAD for personal projects but had mixed
feelings about it. As things got more complex it was hard for me to
visualize and build out objects in code. I decided I would give
FreeCAD a try for this project.</p>
<p>I’d briefly tried using it without much success in the past. The
recommended tutorial didn’t click and it was unclear how I’d turn the
result of a sketch in to a working project. I was probably too
hasty. This time I found an <a href="https://www.youtube.com/watch?v=u8otDF_C_fw&list=PL4eMS3gkRNXcvNnawxzuzRlFDa5CseoQV">excellent series of video tutorials from
Flowwie</a>
that were enough to get up to speed and hit the ground running. I was
surprised how quickly I was able to build out all the parts and
generate STLs for 3d Printing.</p>
<h1 id="aluminum-extrusions">Aluminum Extrusions</h1>
<p>I bought 3 400mm pre-cut 2040 Aluminum Extrusions to provide most of
the strength for the frame. These are easy to find on
Amazon/eBay/Ali-Express and I think I’ll be using them more in the future.</p>
<p>The only trouble is that the holes at the ends of the extrusions
aren’t pre-threaded like they would be in a 3D printer kit. That’s easy
enough to fix with a $10 screw tapping kit with an M5 sized tap. On my
2040 extrusions the hole was already a decent size. There was no need
to drill a pilot hole before using the tap.</p>
<p><a href="/assets/img/can-crusher/thread-tapping.png"><img src="/assets/img/can-crusher/thread-tapping.png" width="50%" /></a></p>
<p>It requires surprisingly little force to tap the ends. The tap will be
a little loose at first since it’s tapered to make it easy to get
started. Once the tap is past the tapered part and not wobbling, I just kept
twisting one full revolution to thread, then one half revolution back
to clear out the aluminum you just cut. The aluminum is surprisingly
soft and can even clump together in balls almost as if it’s melted
together.</p>
<blockquote>
<p><strong>ProTip™:</strong> If you start to feel
more resistance while tapping, don’t power through it! Back the tap out a few
turns. If it still feels stiff, run it back and forth through the
already-threaded area quickly a few times. That should dislodge any
fragments of aluminum and let you resume tapping.</p>
</blockquote>
<h1 id="3d-prints">3D Prints</h1>
<p>The 3D printing was straightforward for me. I’ve done plenty of that
before. I used PETG to get better strength than PLA. Really the only
trick was to make test prints when testing things like screw hole size
and position. Then I could test in 20 minutes instead of waiting 3
hours to find out things were misaligned by 1 mm.</p>
<h1 id="cnc">CNC</h1>
<p>I bought a Genmitsu 3018 PROVER a bit ago that I hadn’t put to much
use. I decided I would use that to mill out a few plates from acrylic
instead of 3D printing. 3D printed parts aren’t always the strongest.
I wanted something more solid than a layered 3D print.</p>
<p>The CNC machine takes a .gcode file, but it’s different enough from 3D
printing that I needed to use a different program to generate it than
I use for my 3D prints. FreeCAD has a <em>Path Workbench</em> that is
supposed to generate good code for CNC purposes. I decided to go with that.</p>
<p>Unfortunately, I lost a day trying to use FreeCAD 0.20.1 to generate the
required gcode. There were all sorts of problems on multiple machines,
hard crashes, etc. This was a real shame because in general I can’t
recommend FreeCAD enough. It’s a great product. Hopefully the bugs will
get worked out. Until then, I was able to use FreeCAD 0.19.4 to keep
moving. I exported <strong>.step</strong> files and imported them into a clean
FreeCAD 0.19.4 project on a different computer.</p>
<p>The next problem was that some test parts had dimensions that were off
by a half millimeter or so. Since this controls the positioning of the
aluminum extrusions, it’s critical that the dimensions be
correct. After some debugging I determined the CNC simply required
manual calibration since it still had its out-of-the-box
settings. After measuring expected vs actual movement I needed to
change the <strong>steps-per-millimeter</strong> settings from the default 800 to
about 794.5 for both the X and Y axes. That’s about 0.7% error, and it
does add up over 100 or 200 mm.</p>
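<p>The calibration is simple proportional arithmetic. Here is a minimal Python sketch, assuming you command a known move and measure what the machine actually did. The function names and the 100.69 mm measurement are hypothetical, chosen only so the result lands near the 794.5 figure above. On GRBL-based controllers the X and Y steps-per-millimeter values are normally settings <strong>$100</strong> and <strong>$101</strong>, but check your own firmware before changing anything.</p>

```python
def corrected_steps_per_mm(current_steps_per_mm, commanded_mm, measured_mm):
    """Return the steps/mm value that makes a commanded move come out the right length.

    If the machine travels too far, measured_mm > commanded_mm and the
    result is smaller than the current setting.
    """
    if measured_mm <= 0:
        raise ValueError("measured_mm must be positive")
    return current_steps_per_mm * commanded_mm / measured_mm


def percent_error(old_steps_per_mm, new_steps_per_mm):
    """Size of the correction as a percentage of the old setting."""
    return abs(old_steps_per_mm - new_steps_per_mm) / old_steps_per_mm * 100.0


if __name__ == "__main__":
    # Hypothetical measurement: asked for 100 mm, calipers read 100.69 mm.
    new_setting = corrected_steps_per_mm(800.0, 100.0, 100.69)
    print(f"new steps/mm: {new_setting:.1f}")
    print(f"correction:   {percent_error(800.0, new_setting):.2f}%")
```

<p>Measuring over the longest move the machine allows keeps caliper error small relative to the distance being checked.</p>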
<p>After that it was smooth sailing and my CNC was just big enough to
profile out all the parts. And the screw holes lined up perfectly with
a 3D printed base with feet that went under the bottom frame holder.</p>
<p><a href="/assets/img/can-crusher/acrylic-plates.jpeg"><img src="/assets/img/can-crusher/acrylic-plates.jpeg" width="50%" style="float:right;" /></a>
<a href="/assets/img/can-crusher/cnc-running.jpeg"><img src="/assets/img/can-crusher/cnc-running.jpeg" width="50%" /></a></p>
<h1 id="assembly">Assembly</h1>
<p>Assembly was straightforward. Just put the parts together and
secure with M5 bolts, either directly into the tapped aluminum
extrusions, or with some T-Slot Nuts to position things on the length
of the rails. Having a printed base under the structural acrylic with
the bolts running through both worked better than expected. I attached
some NEMA 17 stepper motors temporarily to the motor holders with some
M3 bolts to verify the design.</p>
<p>The results:</p>
<p><a href="/assets/img/can-crusher/final-frame-assembly.jpeg"><img src="/assets/img/can-crusher/final-frame-assembly.jpeg" width="100%" /></a></p>
<h1 id="next-up">Next up…</h1>
<p>A proof of concept stepper motor driver powered by a Raspberry Pi Pico
and some TMC2209 driver boards. These should drive the stepper
motors. I chose TMC2209 drivers so I could try to create a self-homing
machine. This avoids annoying mechanical limit switches that are
difficult to position correctly.</p>
<hr />
<p><em>All CAD files, electronic files, source code, etc, referenced in this
post series are available on <a href="https://github.com/grant-olson/can-crusher">my GitHub
page</a>.</em></p>
KC3MLC's Magloop2019-11-12T00:00:00+00:00http://www.grant-olson.net/news/2019/11/12/magloop<p><a href="/assets/img/magloop/4ft_loop.jpg"><img src="/assets/img/magloop/thumbs/4ft_loop.jpg" width="100%" /></a></p>
<p>I finally built my own MagLoop and wanted to share plans, build tips,
theory, and performance with everyone else who is thinking about
trying to make one.</p>
<p>The design is moderately portable and defaults to a 4 foot diameter
loop. In addition, the loop can be swapped out for a smaller 2 foot
loop to work higher frequencies up to 10 meters, and a larger 8 foot
loop to get better efficiency on 30-80 meters.</p>
<p>I’ve made plenty of FT8 contacts from 15 to 80 meters on four continents
(mine included) and have been happy with the results. I hope someone
else can find it useful too.</p>
<p>Want to skip ahead?</p>
<ul>
<li><a href="#design-goals-and-decisions">Design</a></li>
<li><a href="#build">Build</a></li>
<li><a href="#installation">Installation</a></li>
<li><a href="#tuning">Tuning</a></li>
<li><a href="#performance">Performance</a></li>
</ul>
<h2 id="basic-magloop-theory">Basic Magloop Theory</h2>
<p>One problem I had while learning about magloops was that things just
didn’t make sense! Everything I’d read before said antennas need
to be 1 wavelength loops, 1/2 wavelength dipoles, or 1/4 wavelength
verticals with ground radials. How on earth can a loop that’s only
1/8th to 1/4th a wavelength, fed by another element that’s 1/5th the
size of that (so 1/40th to 1/20th of a wavelength) even work at all?</p>
<p>Let’s get the very brief theory of operation out of the way as
painlessly as possible:</p>
<p>Start with the main loop. You have a single loop of wire attached to a
capacitor opposite the driven element. We know that a coil of wire
creates an inductor. In this case the loop is a coil with exactly one
turn and is indeed an inductor! Combined with the capacitor, we now
have an <a href="https://en.wikipedia.org/wiki/LC_circuit">LC circuit</a> which will resonate at appropriate frequencies.</p>
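<p>To put a number on that: an ideal LC circuit resonates at f = 1 / (2π√(LC)). The Python sketch below is only an illustration. The 3.5 µH loop inductance is an assumed placeholder, not a measurement of this antenna, and the 10 to 500 pF range is the capacitor described later in this post.</p>

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC circuit: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


ASSUMED_LOOP_INDUCTANCE_H = 3.5e-6  # hypothetical value, for illustration only

if __name__ == "__main__":
    for picofarads in (10, 50, 100, 250, 500):
        f_hz = resonant_frequency_hz(ASSUMED_LOOP_INDUCTANCE_H, picofarads * 1e-12)
        print(f"{picofarads:4d} pF -> {f_hz / 1e6:6.2f} MHz")
```

<p>More capacitance or more inductance means a lower resonant frequency, which is why a bigger loop helps on the lower bands.</p>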
<p>Next up is the driven element, the loop of wire connected to your
transmitter and placed inside the main loop. Once again, even though this
is a single loop, it’s also a one-turn coil of wire, making
it another inductor. And what
happens when you place two coils of wire next to each other? You get a
<a href="https://en.wikipedia.org/wiki/Transformer">transformer</a>. The driven
element carries the RF energy from your transmitter, the transformer action couples it into the main loop,
and we have radio waves.</p>
<p>Hopefully that makes things seem slightly less mysterious.</p>
<h2 id="design-goals-and-decisions">Design Goals and Decisions</h2>
<h3 id="it-all-begins-with-an-absurdly-large-capacitor">It All Begins With an Absurdly Large Capacitor!</h3>
<p><a href="/assets/img/magloop/500pf-cap.png"><img src="/assets/img/magloop/thumbs/500pf-cap.png" width="33%" style="float:right;" /></a></p>
<p>As I read up on literature and ran through online magloop calculators,
a recurring theme is that a magloop can generate extremely high
voltages, 3000-5000 volts and more! A standard air variable
capacitor, where you rotate two sets of metal plates, can’t handle
those voltages with even moderate power. This isn’t just a case of
the literature being overly conservative. A previous magloop I built
with such a capacitor would generate blue electric arcs when my
transmitter hit even 15 to 20 watts.</p>
<p>I’m only planning to run 100 watts, but needed a capacitor that could
handle more power. I needed a vacuum-sealed capacitor which removes
the easy path for high voltage electric arcs between plates. The best way to get
one that can do this and is affordable to the hobbyist is to order old
Soviet-Era surplus vacuum variable capacitors off of eBay.</p>
<p>As I browsed the various listings from sellers in Ukraine and the
Russian Federation, I kept up-selling myself. <em>Only $10 more to handle
10,000 volts! Sold!</em> <em>Only another $20 for another 100 picoFarads!
Done!</em> And before I knew it I had purchased the biggest monstrosity of
a variable capacitor that the finest minds in Soviet engineering could
produce: A 10-500 picoFarad capacitor with a 10 kV rating. I
didn’t realize how big I had gone until the thing arrived. It was
huge! 10 inches long, five inches wide each way, and weighing in the
range of 6 pounds!</p>
<p>This forced me to change my initial design. Most loop builders indicate
that they’ve gotten better results with the capacitor at the top of
the loop and the driven element at the bottom. That was out based on
the weight. And I originally hoped to get some height off the ground
with a simple mast or tripod. Things would also be too top heavy for
that.</p>
<p>I decided to turn the size of the capacitor from a weakness to a
strength. Instead of attaching to a mast, I would create a base unit
that could stand on its own. The weight of the capacitor itself would
help stabilize the antenna. The base could then be placed on a picnic
table, the roof of a <em>parked</em> car, or even an upside down bucket for actual
usage.</p>
<h3 id="portable">Portable</h3>
<p>I wanted my design to be portable in several ways. Many designs out
there involve making an 8 or 16 foot high octagon out of copper pipe
that has been brazed together! I wanted something that I could throw
into my car and hopefully take on a POTA excursion one day. And I at
least wanted to be able to get the thing inside my house or garage
without having to take a cutting torch to it!</p>
<p>Based on this, the main loop is just good old RG-213 coax with the
shield acting as the loop. I can coil up the loop when not in use. As
an added bonus, this allowed me to make different swappable variations
of loop sizes easily.</p>
<h3 id="temporary">Temporary</h3>
<p>This antenna is intended to be used on site temporarily. It’s not intended to
be mounted permanently. As such I tried to make it a little weather
resistant in case there’s some light rain, but made no attempt to make
it fully water or wind proof. If storms are coming the antenna goes inside.</p>
<h3 id="no-motors">No Motors</h3>
<p>Many designs include a motor that spins the tuning capacitor at a
distance. Some include a rotator to take advantage of the antenna
directionality.</p>
<p>I had serious problems with a motor attached to a smaller capacitor on
a previous antenna. It caused all kinds of stray capacitance and
bizarre changes to SWR at random inexplicable times. I didn’t know if
it was the control cable for the motor, the coils of wire in the
motor, the connection to the capacitor, or what.</p>
<p>For now I will tune the antenna by hand, and position it by hand. This
can be a little annoying, but I don’t intend to hunt-and-seek
contacts. I plan to sit on a frequency, such as an FT8 frequency, or
possibly calling CQ on a single frequency in a future POTA
excursion. And running FT8 has worked just fine.</p>
<p>I’ll probably reconsider this at some point, but at least then I will
have a good handle on my baseline expectations for the antenna and a
better feel for whether, and which, problems are caused by a new motorized
attachment.</p>
<h2 id="build">Build</h2>
<p><a href="/assets/img/magloop/capacitor.jpg"><img src="/assets/img/magloop/thumbs/capacitor.jpg" width="50%" style="float:right;" /></a></p>
<h3 id="capacitor-mounting-and-base">Capacitor Mounting and Base</h3>
<p>As I mentioned, when my capacitor from Ukraine arrived after six long
weeks, it was bigger than I expected by far! But I was ready to get to
work. So I went to my local big box hardware store and found a 12 inch
by 12 inch plastic electrical junction box. If I had been more
patient, I probably could have found something cheaper.</p>
<div style="clear:both;" />
<p>To make the basic base, first mount the SO-239 adapters:</p>
<ol>
<li>Mark the centers of the adapters on the outside of the base.</li>
<li>Drill pilot holes.</li>
<li>Drill out the large holes until the threaded part of the SO-239 can
fit through them.</li>
<li>Insert the adapter through the hole backwards, so it faces
<strong>inside</strong> the case. Mark the location of the four mounting holes in
the adapter. Remove the adapter.</li>
<li>Drill the mounting holes.</li>
<li>Insert the adapters the correct way around and mount them with nuts and
bolts.</li>
<li>Use one extra long bolt on each adapter so you can eventually wire
up the capacitor.</li>
</ol>
<blockquote>
<p><strong>ProTip™:</strong> If you’re starting to make a bunch of amateur
radio gear and you don’t have a <a href="https://www.google.com/search?q=step+drill+bit">STEP DRILL BIT</a> you’ll want to get
one as soon as possible. It allows you to drill large holes for
things like SO-239 adapters without having to switch out bits
repeatedly and without melting ABS plastic. A set is a few dollars
and probably available at your local Harbor Freight.</p>
</blockquote>
<p><a href="/assets/img/magloop/ubolts.jpg"><img src="/assets/img/magloop/thumbs/ubolts.jpg" width="15%" style="float:right;" /></a>
To make the support holder:</p>
<ol>
<li>Drill three sets of holes: top, middle, and bottom, that will each
fit a 1 inch U-bolt.</li>
<li>Insert the U-Bolts and thread them in from the inside.</li>
</ol>
<div style="clear:both" />
<p>To attach the capacitor:</p>
<ol>
<li>Cut two 8-inch lengths of 14 Gauge insulated stranded wire.</li>
<li>Strip 1/2 inch or so of insulation off one end of each wire.</li>
<li>On the long bolt on each adapter:
<ul>
<li>Add one washer.</li>
<li>Wrap the exposed wire around.</li>
<li>Add another washer.</li>
<li>Add a nut and tighten to get a good connection.</li>
</ul>
</li>
<li>Take two hose clamps and put them on the ends of the
capacitor. Screw them down until there’s a half inch of slack.</li>
<li>Place the capacitor in the base and position it.</li>
<li>Trim and strip the other ends of the 14 gauge wires so that they
will reach the capacitor and have enough stripped wire to wrap
around the hose clamps twice.</li>
<li>Remove hose clamps from the capacitor, wrap the wire around twice,
re-attach and screw them down to get good contact between the wire
and capacitor.</li>
</ol>
<p>This completes a usable base.</p>
<p><a href="/assets/img/magloop/hose_clamp.jpg"><img src="/assets/img/magloop/thumbs/hose_clamp.jpg" width="50%" style="float:right;" /></a>
<a href="/assets/img/magloop/attached_wire_cropped.jpg"><img src="/assets/img/magloop/thumbs/attached_wire_cropped.jpg" width="50%" /></a></p>
<div style="clear:both" />
<p>This was enough to start using things, but it eventually
became frustrating to tune the capacitor as it wobbled around. To
reduce the wobbling, I bought a piece of 6 inch wood and cut it to
fit inside the box. I then used some old nylon straps to secure the
capacitor to the board with a few screws.</p>
<blockquote>
<p><strong>ProTip™:</strong> Oftentimes when you’re reading antenna plans
you’ll be presented with a detailed manifest and list of parts. It’s
nice to be exhaustive, but this often gives the impression
that certain parts were carefully spec’ed out and tested, rather
than just being materials available to the author.</p>
<p>The wooden base for my capacitor is just such a thing. It was
wobbling, I wanted to stop it, and I didn’t want to use metal or
drill holes in the exterior case. This was the solution I came up
with. If I had a 3D printer I probably would have come up with
something better. I’m hesitant to give detailed instructions since
it was hacked together.</p>
<p>In short, feel free to improvise with any element of the plans
presented here, and particularly with the capacitor support which
I just threw together in a weekend. Even now I find myself wondering
if I could make a base out of a beer cooler that’s big enough to
hold all the cables and supports.</p>
</blockquote>
<h3 id="pvc-supports">PVC Supports</h3>
<p><a href="/assets/img/magloop/pvc_supports.jpg"><img src="/assets/img/magloop/thumbs/pvc_supports.jpg" width="35%" style="float:right;" /></a> I
made a set of composable PVC support
pipes that would allow me to
easily set up loops of either 2, 4, or 8 foot diameters. My big box
hardware store had pre-cut 2 foot sections of pipe, and I went for
this rather than cut them myself. I also bought adapters to get the
following combinations. The pipes were 3/4 inch and had an outer
diameter of 1 inch.</p>
<ul>
<li>
<p>One pipe with a four way adapter attached to one end which I’ll
refer to as the <strong>crossbeam</strong>.</p>
</li>
<li>
<p>Two pipes with T adapters to support the sides of the RG-213 loop
which I’ll refer to as the <strong>side supports</strong>.</p>
</li>
<li>
<p><a href="/assets/img/magloop/pvc_top_support.jpg"><img src="/assets/img/magloop/thumbs/pvc_top_support.jpg" width="20%" style="float:right;" /></a> One pipe with a modified T adapter to support the top of the loop
which I’ll refer to as the <strong>top support</strong>. A
slot was cut in to this so the main loop, along with the driven
element could be hung on the adapter without having to pass through
it. To do this I put the adapter in a vice and cut it with a hacksaw
<em>before</em> attaching it to the pipe section.</p>
</li>
<li>
<p>Four pipes with a standard coupling attached to one end which I’ll
refer to as the <strong>extensions</strong>.</p>
</li>
</ul>
<p>The attachments were glued on with PVC primer and glue so that when I
disassemble the supports the right parts stay together and the wrong
parts don’t get stuck.</p>
<blockquote>
<p><strong>ProTip™:</strong> <strong>DO THIS
OUTSIDE!</strong> I made the mistake of doing this on our enclosed porch, and
fumes still managed to pervade the entire first floor of our house. We
had to open the windows and air it all out. It is highly recommended
that you have more ventilation.</p>
</blockquote>
<h3 id="loops">Loops</h3>
<p>I will generally refer to the loops by diameter, which is also the height of the
PVC supports. These are nice round numbers; the actual circumference of
each loop is the diameter times <em>Pi</em>. In addition, some length is
subtracted from the ideal cable length to account for the part of the
loop where the capacitor sits and there isn’t any wire.</p>
<p>The loops are made of RG-213 coax with PL-259 plugs attached at each
end. This allows me to insert the loop into the support frame and
screw it in to the base.</p>
<p>The actual cable length was:</p>
<ul>
<li><strong>2 Foot Loop</strong> - 5 foot, 9 inches</li>
<li><strong>4 Foot Loop</strong> - 12 foot, 8 inches</li>
<li><strong>8 Foot Loop</strong> - 23 foot, 6 inches</li>
</ul>
<p>The 8 foot loop ended up being slightly shorter because it was
drooping more, and I went with more of a diamond shaped loop to avoid this.</p>
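<p>For a rough sanity check, the Python sketch below compares the ideal circumference (diameter times Pi) against the cable lengths listed above. The dictionary and helper names are made up for this illustration. By this arithmetic the 2 and 8 foot cables come out shorter than the ideal circle and the 4 foot cable about an inch longer, so treat the nominal diameters as approximate.</p>

```python
import math

# Cable lengths from the list above, converted to inches.
ACTUAL_CABLE_INCHES = {
    2: 5 * 12 + 9,    # 5 foot, 9 inches
    4: 12 * 12 + 8,   # 12 foot, 8 inches
    8: 23 * 12 + 6,   # 23 foot, 6 inches
}


def ideal_circumference_inches(diameter_feet):
    """Circumference of a perfect circle with the given diameter."""
    return math.pi * diameter_feet * 12


def feet_and_inches(total_inches):
    """Split a length in inches into (whole feet, remaining inches)."""
    feet, inches = divmod(total_inches, 12)
    return int(feet), inches


if __name__ == "__main__":
    for diameter_feet, actual in ACTUAL_CABLE_INCHES.items():
        ideal = ideal_circumference_inches(diameter_feet)
        ft, inch = feet_and_inches(ideal)
        print(f"{diameter_feet} ft loop: ideal {ft} ft {inch:.1f} in, "
              f"actual {actual} in, difference {ideal - actual:+.1f} in")
```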
<p>While building your own antenna, rather than measuring out
to these lengths, you should attach a PL-259 adapter to one end of the
coax, feed it through the supports and screw it in to one side of your
base, and then mark off
the appropriate place to cut the other end of the cable. This will
give a better loop if your project box or SO-239 placement is
different than mine.</p>
<h3 id="driven-elements">Driven Elements</h3>
<p><a href="/assets/img/magloop/driven_element.jpg"><img src="/assets/img/magloop/thumbs/driven_element.jpg" width="50%" style="float:right;" /></a></p>
<p>The driven elements are made from RG-8X and should be 1/5 the size of
the main loop coax. I did not try to factor in the full loop size including
capacitor and just used the physical coax length divided by 5. There
also needs to be a connector to hook the driven element up to your
feed line, so when you cut the cable add 6 to 8 inches or more
depending on how confident you are about getting the coax stripped and
adapter installed on the first try. I used a female BNC adapter but
feel free to use whatever adapter you want.</p>
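<p>As a worked example of the 1/5 rule, here is a small Python sketch. The function name and the default 8 inch connector allowance are assumptions for illustration; pick an allowance anywhere in the range suggested above.</p>

```python
def driven_element_cut_length_inches(main_loop_coax_inches, connector_allowance_inches=8.0):
    """Coax to cut for a driven element: 1/5 of the main loop coax plus
    extra length for the feed line connector."""
    if main_loop_coax_inches <= 0:
        raise ValueError("main loop length must be positive")
    return main_loop_coax_inches / 5.0 + connector_allowance_inches


if __name__ == "__main__":
    # Main loop coax lengths from the Loops section, in inches.
    main_loops = {"2 foot": 69, "4 foot": 152, "8 foot": 282}
    for name, coax_inches in main_loops.items():
        loop_part = coax_inches / 5.0
        total = driven_element_cut_length_inches(coax_inches)
        print(f"{name} loop: {loop_part:.1f} in loop + allowance = cut {total:.1f} in")
```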
<p>There are many confusing designs available<sup id="fnref:driven_element" role="doc-noteref"><a href="#fn:driven_element" class="footnote" rel="footnote">1</a></sup> for the driven element. In
the end I chose the least confusing one. This involved soldering the
<strong>center conductor</strong> of the coax to the <strong>shield</strong> where the loop will
be the appropriate size.</p>
<p>To make one:</p>
<ol>
<li>Cut a cable with an extra 6 to 12 inches to mount a connector to
the feedline.</li>
<li>Strip one end of the coax with a standard stripping tool
providing enough exposed center conductor to wrap around the cable.</li>
<li>Cut away the exposed copper shield at the end so it doesn’t
accidentally contact anything it shouldn’t.</li>
<li>
<p>Measure back to the appropriate length and carefully remove 1
cm of casing without damaging the copper shield.</p>
<p>I found it was best to score the jacket on each side creating
visible cut lines without cutting all the way to the copper, cut a slot
out between the two lines, and then peel the rest of the jacket off.</p>
</li>
<li>Wrap the exposed center conductor around the exposed shield
and solder it in place.</li>
<li>Use
<a href="https://www.google.com/search?q=rescue+tape&oq=rescue+tape">silicone self-amalgamating tape</a>
to seal the connection. I imagine heat shrink tubing or electrical
tape would also be fine.</li>
<li>Connect a BNC Female connector or adapter of your choice to the exposed end.</li>
</ol>
<h2 id="installation">Installation</h2>
<p>At this point you should be ready to do an initial smoke test. I would
recommend using the 4 foot antenna and setting the assembly on a table
in your shack
before trying to use it outside.</p>
<h3 id="setting-up-the-antenna">Setting up the antenna</h3>
<p>I would suggest starting with the 4 foot loop, which can get good
reception on 20, 40, and 80 meters. I would also suggest using a
workspace such as a table in your hamshack since it will
take some time to perform initial configuration of the driven
element. Once the driven element is configured it will be easier to
set up in a more desirable location.</p>
<p>To assemble in 4 foot mode:</p>
<ol>
<li>Add the two side supports and top support to the crossbeam.</li>
<li>Insert the crossbeam in to the U bolts and tighten the wing nuts to
hold it in place.</li>
<li>Hang the loop cable inside the top support, and thread the ends
through the side supports.</li>
<li>Screw the PL-259 connectors in to the base unit.</li>
<li>Attach the feed line to the driven element.</li>
</ol>
<p>Initial configuration (<strong>first time only</strong>):</p>
<ol>
<li>Insert the driven element in to the top support under the main loop.</li>
<li>Use an antenna analyzer to roughly adjust the tuning capacitor to
the desired frequency.</li>
<li>Experiment with positioning as described in the
<a href="#tuning-the-driven-element">tuning section</a> securing the driven
element with temporary tape.</li>
<li>Once you find a workable position, permanently tape the driven
element to the main loop in a way that lets both pieces of coax be easily
placed on the top support.</li>
</ol>
<p>It should look like this:</p>
<p><a href="/assets/img/magloop/4ft_assembled.jpg"><img src="/assets/img/magloop/thumbs/4ft_assembled.jpg" /></a></p>
<p>After initial testing you may want to set up the two foot loop. The
procedure is basically the same, but you’ll use the Top Support only
on the base:</p>
<p><a href="/assets/img/magloop/2ft_loop.jpg"><img src="/assets/img/magloop/thumbs/2ft_loop.jpg" /></a></p>
<h3 id="positioning">Positioning</h3>
<p>To get lowest SWR, the loop must be off of the ground. While testing
on my back porch, I just sat it on a bar-height table. When I use the
antenna outside, I set it on an upside-down utility bucket which is
about 18 inches high. Any lower than that and the SWR started creeping
up again. I suspect even higher is still better, as you’ll get better
angles for DX takeoff, but the bucket was adequate to hit Europe and
South America from Pittsburgh.</p>
<p>The antenna also has some directionality, with the strongest signal
shooting off of the ends of the loop. It’s only somewhat directional,
so I generally just point it either North/South or East/West and don’t
try to dial in on an exact bearing.</p>
<h3 id="tuning">Tuning</h3>
<h4 id="tuning-the-capacitor">Tuning the Capacitor</h4>
<p>Initial tuning is done by adjusting the variable capacitor. I attached
a key ring to mine to make it easier to turn. When the capacitor is
fully retracted and has the <em>least</em> amount of capacitance, you’ll be
closest to the ideal efficiency<sup id="fnref:efficiency" role="doc-noteref"><a href="#fn:efficiency" class="footnote" rel="footnote">2</a></sup> for the given loop size. At this
point, very small changes in capacitance will have a large impact on
the frequency sweet spot. As you add more capacitance and the
frequency goes down, you’ll find that the bandwidth decreases and
that you’ll need to move the capacitor more to move the optimal frequency.</p>
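<p>The sensitivity follows from the LC resonance formula f = 1 / (2π√(LC)): frequency falls with the square root of capacitance, so one extra picofarad matters far more at 20 pF than at 400 pF. The Python sketch below illustrates this with an assumed 3.5 µH loop inductance, which is a placeholder and not a measured value for any of these loops.</p>

```python
import math

ASSUMED_LOOP_INDUCTANCE_H = 3.5e-6  # hypothetical value, not measured


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def shift_per_picofarad_hz(capacitance_pf, inductance_h=ASSUMED_LOOP_INDUCTANCE_H):
    """How far the resonant frequency drops when 1 pF is added."""
    before = resonant_frequency_hz(inductance_h, capacitance_pf * 1e-12)
    after = resonant_frequency_hz(inductance_h, (capacitance_pf + 1) * 1e-12)
    return before - after


if __name__ == "__main__":
    for pf in (20, 100, 400):
        f_mhz = resonant_frequency_hz(ASSUMED_LOOP_INDUCTANCE_H, pf * 1e-12) / 1e6
        shift_khz = shift_per_picofarad_hz(pf) / 1e3
        print(f"{pf:3d} pF: {f_mhz:5.2f} MHz, +1 pF moves it down {shift_khz:.1f} kHz")
```

<p>In practice this means gentle nudges near the retracted end of the capacitor and bigger turns as you tune toward the lower bands.</p>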
<h4 id="tuning-the-driven-element">Tuning the Driven Element</h4>
<p>When you initially tune the antenna, you’ll probably have a poor
best-case SWR. Once you’re within range of the frequency you want to
transmit on, you’ll need to adjust the driven element positioning to
get the best possible SWR. This is done through experimentation and in
my experience relies on two factors:</p>
<ol>
<li>How much of the top of the driven element contacts the loop.</li>
<li>How far up or down the bottom of the driven element is from the
main loop.</li>
</ol>
<p>You’ll need to experiment for yourself. Here are some notes on what
worked for me. These are not intended to be prescriptive; you’ll need
to find your own positioning. They will just give you a starting point
for adjustments to try:</p>
<ul>
<li>
<p>The driven elements should be close to the plane of the main loop,
but don’t worry about the support pipes preventing you from getting
that last 1/2 inch.</p>
</li>
<li>
<p><a href="/assets/img/magloop/2ft_driven_element.jpg"><img src="/assets/img/magloop/thumbs/2ft_driven_element.jpg" width="15%" style="float:right;clear:both;overflow:auto;" /></a> My driven element for the 2 foot loop most resembles what you’ll see
in diagrams and is a nice round circle attached to the top of the
antenna. Even still, some movement up and down on the bottom half
helps as I switch between various bands.</p>
</li>
<li>
<p><a href="/assets/img/magloop/4ft_driven_element.jpg"><img src="/assets/img/magloop/thumbs/4ft_driven_element.jpg" width="15%" style="float:right;clear:both;overflow:auto;" /></a> My driven element for 4 foot worked best with the most surface area
possible attached to the main loop, and the bottom half located
extremely high, making a crescent shape.</p>
</li>
<li>
<p><a href="/assets/img/magloop/8ft_driven_element.jpg"><img src="/assets/img/magloop/thumbs/8ft_driven_element.jpg" width="15%" style="float:right;clear:both;overflow:auto;" /></a> The driven element for the 8 foot worked best
in a kite shape, with
barely any contact on the top of the loop. This also benefited from
adjusting the position of the bottom on different frequencies. In
general as the frequency goes down, pulling the bottom down helped
on both this and the two foot antenna.</p>
<p>As pictured, you can see
velcro holding the loop in a position which is best for 40 and 80
meters, and I move it up for 30 meters.</p>
</li>
</ul>
<h2 id="setting-up-the-8-footer">Setting up the 8 footer</h2>
<p><a href="/assets/img/magloop/8ft_front.jpg"><img src="/assets/img/magloop/thumbs/8ft_front.jpg" /></a></p>
<h4 id="tension-lines">Tension lines</h4>
<p>The 8 foot loop requires tension lines to hold the PVC supports
in place. Drill 1/4 inch holes in the support elements, large enough to
pass paracord through. Holes should be drilled through both sides of the
pipe so the cord can be run through the pipe.</p>
<p>Hole Placement:</p>
<ul>
<li>
<p><strong>Side and Top Supports</strong> A hole close to and parallel with the T
Adapters.</p>
</li>
<li>
<p><strong>Crossbeam Support</strong> Two sets of holes, near the base with enough
room for the extension element, perpendicular to each other with a
half inch space between them.</p>
</li>
<li>
<p><strong>Three Extensions</strong> One hole drilled all the way through near the
coupling.</p>
</li>
<li>
<p><strong>The Base Extension</strong> Two sets of holes like the crossbeam support,
but located so that they are <em>above</em> the capacitor housing so it
doesn’t interfere with the tension ropes.</p>
</li>
</ul>
<p>Review the picture of the assembled antenna if my placement directions
don’t make sense.</p>
<p>Cut appropriate lengths of rope and melt the ends shut with a
grill lighter. If you smell burning plastic and see smoke, you’ve
melted them too much. Wrapping the cord around the pipe takes
a surprising amount of cord, so start off with more cord than you think
you’ll need and trim it back after your initial assembly.</p>
<p>First assemble the inner set of supports:</p>
<ol>
<li>Attach the crossbeam
and three extensions with normally placed holes.</li>
<li>Tie a knot in one end of the paracord and insert it through one set of holes at the base
of the crossbeam.</li>
<li>Work through the other three supports running the line through the
hole, pulling it tight, and tying a
<a href="https://www.netknots.com/rope_knots/clove-hitch">clove hitch</a>.</li>
<li>Once these three supports are secure, run the line through
the second set of holes on the crossbeam, tie a
<a href="https://www.netknots.com/rope_knots/tautline-hitch">tautline hitch</a>, and
pull it tight.</li>
</ol>
<p>The end supports can be attached and tied with the same procedure.
It is best
to (1) do this outdoors, and (2) only insert the supports one at a time as you’re
ready to tie them in place.</p>
<p>Next mount the support in to the base and tighten
down the U Bolts. Although you’ll need guy lines for long term
installation, this will usually sit fine on flat land without tipping
over. Still, release the supports slowly as you let go the first time.</p>
<blockquote>
<p><strong>ProTip™:</strong> Install the main loop and attach the driven element to the feed
line before inserting the support in to the base.</p>
</blockquote>
<h4 id="guy-lines">Guy lines</h4>
<p>The 8 foot loop also needs guy lines. To prepare the guy lines, cut paracord to three appropriate
lengths, melt the ends closed, and tie a <a href="https://www.netknots.com/rope_knots/bowline">bowline knot</a> on one
end.</p>
<p>To install the guy lines, loop them around the center mast, place the
antenna on some sort of support, secure the lines to stakes with
<a href="https://www.netknots.com/rope_knots/tautline-hitch">tautline hitches</a>,
and apply tension.</p>
<p>A helper is extremely useful here especially when the land is not
flat, but you can usually find one line to secure first while holding
it tight and then go back and get the other two lines in to place
after the fact.</p>
<p>For the initial driven element configuration, you will probably also
want a step stool to find optimal placement.</p>
<h2 id="performance">Performance</h2>
<blockquote>
<p><strong>ProTip™:</strong> <a href="https://pskreporter.info/pskmap.html">pskreporter.info</a> provides
a great way to see how your antenna is doing when you’re working
digital modes. Various stations send signal reports to a centralized
location where you can view how well things are propagating, even
when people are ignoring your CQ calls.</p>
<p>Note that if you want to help out others and send your FT8 reception
reports, you must enable this in WSJT-X settings under <strong>Reporting</strong>.</p>
</blockquote>
<p>All results shown are from grid square EN90 running a Yaesu FT-857d at
100 Watts, operating in November 2019 in the deepest darkest reaches
of the end of <a href="https://en.wikipedia.org/wiki/Solar_cycle_24">Solar Cycle 24</a>.</p>
<table class="magloop">
<tr><th style="width:10%;">Band</th><th style="width:45%"> </th><th style="width:45%"> </th></tr>
<tr><th>15</th>
<td>
<a href="/assets/img/magloop/15m_2ft_morning.png"><img src="/assets/img/magloop/thumbs/15m_2ft_morning.png" /></a>
<small>2 Foot / North-South / Morning</small>
<br /><small>QSOs with Germany, France,
and Italy, along with a new DXCC entry at Bosnia-Herzegovina.</small>
</td>
<td>
</td></tr>
<tr><th>20</th>
<td>
<a href="/assets/img/magloop/20m_bearing_zero_alaska.png"><img src="/assets/img/magloop/thumbs/20m_bearing_zero_alaska.png" width="100%" /></a>
<small>4 foot / North-South / Afternoon</small><br /><small> First
Alaska contact and a couple hits in Brazil.</small>
</td>
<td>
<a href="/assets/img/magloop/20m_2ft_dusk.png"><img src="/assets/img/magloop/thumbs/20m_2ft_dusk.png" /></a>
<small>2 foot / North-South / Dusk.</small><br /><small>Not as bad as expected.</small>
</td>
</tr>
<tr><th>30</th>
<td>
<a href="/assets/img/magloop/30m_8ft_ew.png" width="100%"><img src="/assets/img/magloop/thumbs/30m_8ft_ew.png" width="100%" /></a>
<small>8 foot / East-West / Afternoon.</small>
<br /><small>QSOs with Italy, Croatia, Crete, and
the Balearic Islands.</small>
</td>
<td>
<a href="/assets/img/magloop/30m_4ft_ne.png" width="100%"><img src="/assets/img/magloop/thumbs/30m_4ft_ne.png" width="100%" /></a>
<small>4 foot / NorthEast-SouthWest / Afternoon.</small>
<br /><small>Pointing directly at London helps us get to Europe, but still not as far inland as the 8 foot loop.</small>
</td></tr>
<tr><th>40</th>
<td>
<a href="/assets/img/magloop/40m_8ft_morning.png"><img src="/assets/img/magloop/thumbs/40m_8ft_morning.png" /></a>
<small>8 Foot / East-West / Morning.</small>
<br />
<small>We are getting further than ground-wave, but only high angles
are making it back, giving an effective radius of somewhere between
750 and 1000 miles with a few outliers.</small>
</td>
<td>
<a href="/assets/img/magloop/40m_8ft_sunset.png"><img src="/assets/img/magloop/thumbs/40m_8ft_sunset.png" /></a>
<small>8 Foot / East-West / Sunset</small>
<br />
<small>Now we're seeing good results to Europe!</small>
</td></tr>
<tr><th>80</th>
<td>
<a href="/assets/img/magloop/80m_8ft_night.png"><img src="/assets/img/magloop/thumbs/80m_8ft_night.png" /></a>
<small>8 Foot / East-West / 10 PM </small>
</td><td></td></tr>
</table>
<h2 id="reference">Reference</h2>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:driven_element" role="doc-endnote">
<p><a href="https://www.nonstopsystems.com/radio/frank_radio_antenna_magloop.htm#induct">Inductive Coupling Designs</a>
This article has a wealth of information on all things magloop, with a
particularly detailed description of various ways to drive the magloop. <a href="#fnref:driven_element" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:efficiency" role="doc-endnote">
<p><a href="http://www.66pacific.com/calculators/small-transmitting-loop-antenna-calculator.aspx">Online Efficiency Calculator</a>
and
<a href="http://www.aa5tb.com/loop.html#cal">AA5TB’s Excel Application</a>
allow you to estimate how much of your power makes it to the airwaves
for a given loop and frequency. <a href="#fnref:efficiency" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Using MiniCom with FlashForth on OSX2019-09-04T00:00:00+00:00http://www.grant-olson.net/news/2019/09/04/using-minicom-with-flashforth-on-osx<p>I’ve been spending some of my free time lately playing with Arduinos and
Raspberry Pis. They are dirt cheap, have GPIO ports exposed, and there are
plenty of peripheral devices available from China for as low as 1 or 2
dollars. Controlling the hardware is different enough from my day job that it
makes a great engineering hobby.</p>
<p>The Arduino IDE is nice
if you’re trying to bring up some hardware quickly, but I was a little annoyed
at the way all the actual hardware access was hidden behind library code that
you don’t see in day-to-day development. My adventures took me to <a href="http://flashforth.com/">FlashForth</a>, an implementation
of Forth targeted for the microprocessors that run the Arduinos. Forth has
always been on my languages-to-learn list. It’s supposed to be a high-level
language that can be implemented in a small footprint (as small as 8K!), has interactivity,
and gives you direct access to the bare metal. In the past I’d just be using it
in a sandbox without touching hardware, but now I’ve found a good excuse to
play around with it on these devices with only 32k of memory and utilize the
bare metal access.</p>
<p>Things had been going well as I worked on tutorials and got the basics of
Forth down, but they came to a screeching halt when I started developing a
real project (an SD card driver). I was writing my code in files and
sending code blocks to the device as fast as my computer would let me.
FlashForth would just start printing a bunch of <code class="language-plaintext highlighter-rouge">|||||</code> and wouldn’t receive
the input. This problem is documented on the homepage:</p>
<blockquote>
<p>Normally communication with the PC and writing to flash works very reliably, but…</p>
<p>If to you see a vertical bar <em>|</em> output from FlashForth, it means that the
UART RX interrupt buffer has overflowed.</p>
<p>It is usually caused by the PC reacting slowly on XOFF.
<code class="language-plaintext highlighter-rouge">setserial /dev/ttyS0 low_latency</code> improves the situation on Linux.</p>
<p>On Windows, disabling the UART buffers improves the situation.
Another alternative is to use TeraTerm with an intercharacter delay of a few milliseconds.</p>
</blockquote>
<p>Unfortunately the fixes covered Windows and Linux, with nothing for OSX. The <code class="language-plaintext highlighter-rouge">setserial</code>
command isn’t included on OSX. I eventually decided I needed to find a terminal program
that ran on OSX and allowed you to set the above listed <em>intercharacter delay of a few milliseconds</em>.
This proved easier said than done. After spending time trying a plethora of programs, both
Open Source and commercial demos, I finally found out that <strong>minicom</strong> could do what I needed,
and was available via brew.</p>
<p>Configuration is a little tricky though. There were a few settings I needed to change to make
things work. And unfortunately the pause-between-characters setting doesn’t get saved in
configuration, so I need to set that up every time I fire up the app. But once I’m up
and running it works well. Here are the two phases of setup:</p>
<h2 id="initial-configuration">Initial Configuration</h2>
<ol>
<li>Install minicom: <code class="language-plaintext highlighter-rouge">brew install minicom</code>.</li>
<li>Start minicom: <code class="language-plaintext highlighter-rouge">minicom -b 38400 -d /dev/cu.YOURDEVICE</code></li>
<li><code class="language-plaintext highlighter-rouge">ESC-Z</code> brings up menu.
<ol>
<li><code class="language-plaintext highlighter-rouge">O</code> for <code class="language-plaintext highlighter-rouge">cOnfigure Minicom</code>.</li>
<li>Down arrow to <code class="language-plaintext highlighter-rouge">Serial port setup</code> hit <code class="language-plaintext highlighter-rouge">ENTER</code>.
<ol>
<li><code class="language-plaintext highlighter-rouge">E</code> to set baud rate.</li>
<li>Optionally <code class="language-plaintext highlighter-rouge">A</code> to set serial device, but mine changes enough I use the <code class="language-plaintext highlighter-rouge">-d</code> command line flag.</li>
<li><code class="language-plaintext highlighter-rouge">ESC</code> to go up a menu level.</li>
</ol>
</li>
<li>Down arrow to <code class="language-plaintext highlighter-rouge">Screen and keyboard</code> hit <code class="language-plaintext highlighter-rouge">ENTER</code>.
<ol>
<li><code class="language-plaintext highlighter-rouge">P</code> to set <code class="language-plaintext highlighter-rouge">Add linefeed</code> to <code class="language-plaintext highlighter-rouge">No</code> so you don’t double space.</li>
<li><code class="language-plaintext highlighter-rouge">R</code> to turn <code class="language-plaintext highlighter-rouge">Line Wrap</code> on so a command like <code class="language-plaintext highlighter-rouge">words</code> doesn’t run off the page.</li>
<li><code class="language-plaintext highlighter-rouge">ESC</code> to go up a menu level.</li>
</ol>
</li>
<li>Down arrow to <code class="language-plaintext highlighter-rouge">Save setup as dfl</code> hit <code class="language-plaintext highlighter-rouge">ENTER</code>.</li>
</ol>
</li>
</ol>
<p>At this point all of the savable settings are stored in your config for later. We still haven’t solved
the original problem of adding a delay though. This will need to be done every time at minicom startup.</p>
<h2 id="add-intercharacter-delay">Add intercharacter delay</h2>
<ol>
<li><code class="language-plaintext highlighter-rouge">ESC-Z</code> brings up menu.</li>
<li><code class="language-plaintext highlighter-rouge">T</code> for <code class="language-plaintext highlighter-rouge">Terminal Settings</code>.</li>
<li><code class="language-plaintext highlighter-rouge">F</code> for <code class="language-plaintext highlighter-rouge">Character tx delay (ms)</code>.</li>
<li>I went with <code class="language-plaintext highlighter-rouge">10</code> ms for best results on my system. Lower numbers may work for you.</li>
<li><code class="language-plaintext highlighter-rouge">ESC</code> to leave menu and return to main screen.</li>
</ol>
<p>At this point you should be able to safely paste large chunks of code for processing by the FlashForth
interpreter.</p>
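<p>If you are curious what that delay setting does, here is a rough Python sketch of the idea: write one byte, pause, repeat. It illustrates the technique and is not how minicom is implemented. The <code class="language-plaintext highlighter-rouge">send_with_char_delay</code> name and the in-memory stand-in for the serial port are inventions for this example.</p>

```python
import io
import time


def send_with_char_delay(port, text, delay_ms=10, sleep=time.sleep):
    """Write text to port one byte at a time, pausing delay_ms after each byte.

    port is any object with a write(bytes) method, such as an open serial
    device. Returns the number of bytes written.
    """
    data = text.encode("ascii")
    for i in range(len(data)):
        port.write(data[i:i + 1])
        sleep(delay_ms / 1000.0)  # give the receiver time to drain its buffer
    return len(data)


# Demo against an in-memory buffer instead of a real serial device.
fake_port = io.BytesIO()
sent = send_with_char_delay(fake_port, "1 2 + .\n", delay_ms=1)
print(sent, fake_port.getvalue())
```

<p>At 10 ms per character you get about 100 characters per second, so a 2 KB source file takes roughly 20 seconds to send. Slow, but it beats a screen full of vertical bars.</p>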
<p>Have Fun!</p>
<p>Now I just need to work on direct Emacs integration instead of using <code class="language-plaintext highlighter-rouge">Ctrl-C</code> <code class="language-plaintext highlighter-rouge">Ctrl-V</code>. If anyone has
tips let me know.</p>
Help! Google Adwords API Keys Stopped Working August 22nd!2018-08-23T00:00:00+00:00http://www.grant-olson.net/news/2018/08/23/help-google-adwords-api-keys-dont-work<p>[TLDR? Fix is at bottom of page.]</p>
<p>We just spent two days debugging a problem with our google adwords API
keys and finally got things working. We’re not sure if this problem is
affecting people globally, but it was particularly difficult to debug
so I wanted to get this information out there. Let me know if it
helped you out.</p>
<p>We’ve been using API keys to generate google ads for several years
now. If you’re using them, you have a basic understanding of how
things work in ruby-land. There is a configuration file <code class="language-plaintext highlighter-rouge">adwords_api.yml</code> that you
set up with basic values, then run the <code class="language-plaintext highlighter-rouge">setup_oauth2.rb</code> script
included in the github repository for Google’s gems. This has you do
browser-based authentication, and then the file will have a refresh
token. When this file is re-used, the refresh token is used to request
an access token as needed, and this access token is used to generate
ads.</p>
<p>Yesterday morning, things blew up horribly and all of our processes
generating ads failed! We went into major fire-fighting mode and did some
basic debugging. We decided that a three year old refresh token might
be the problem, and regenerated one locally, and things seemed to be
working again. But then, after an hour, all our processes would blow
up again. We could fix the tokens every hour, but this obviously
wasn’t a full-time solution.</p>
<p>As we tracked things down and went through many red herrings, we came
to realize that normally Google’s code would figure out that the
current access token was expired, and would request a new one via the refresh
token. As we dug in to the google code, we finally made our way to the
<a href="https://github.com/googleads/google-api-ads-ruby/blob/master/ads_common/lib/ads_common/auth/oauth2_handler.rb">oauth2_handler.rb file in ads_common</a>. In
particular the <code class="language-plaintext highlighter-rouge">get_token</code> method on line 89:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Overrides base get_token method to account for the token expiration.</span>
<span class="k">def</span> <span class="nf">get_token</span><span class="p">(</span><span class="n">credentials</span> <span class="o">=</span> <span class="kp">nil</span><span class="p">,</span> <span class="n">force_refresh</span> <span class="o">=</span> <span class="kp">false</span><span class="p">)</span>
  <span class="n">token</span> <span class="o">=</span> <span class="k">super</span><span class="p">(</span><span class="n">credentials</span><span class="p">)</span>
  <span class="n">token</span> <span class="o">=</span> <span class="n">refresh_token!</span> <span class="k">if</span> <span class="o">!</span><span class="vi">@client</span><span class="p">.</span><span class="nf">nil?</span> <span class="o">&&</span>
      <span class="p">(</span><span class="n">force_refresh</span> <span class="o">||</span> <span class="vi">@client</span><span class="p">.</span><span class="nf">expired?</span><span class="p">)</span>
  <span class="k">return</span> <span class="n">token</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Using <code class="language-plaintext highlighter-rouge">bundle open google-ads-common</code> we were able to go in and change
<code class="language-plaintext highlighter-rouge">force_refresh</code> to default to true. Lo and behold, authentication was
working! So forcing the creation of a new token solved the immediate
problem, but we still wanted to have a better idea of what was
happening, and we were reluctant to monkey-patch Google’s gem only to
have things break in future versions.</p>
<p>There was one interesting thing we noticed as we looked through
various SOAP output. Getting back to the original inputs, there are
two values attached to the access token, the creation date, and the
time, in seconds, until it expires. As we looked at the output, we
noticed that it was returning <code class="language-plaintext highlighter-rouge">expires_in</code> values that were actually
counterintuitively <em>increasing</em> rather than decreasing. We would have
expected a token that started at 3600 to return 3540 when checked a
minute later. Instead it was returning 3660, and climbing up until
hitting 7200 one hour later, at which point the access token would be
expired, but our code would not generate a fresh token, and our app would
start blowing up.</p>
<p>Unfortunately, we were unable to tell if the value always worked this
way, or if there was a new breaking change introduced yesterday
morning when we first encountered errors.</p>
<p>We were reluctant to monkey-patch the core Google libraries and have
to deal with that. So, armed with the knowledge that our gems weren’t
calculating the expiration date correctly, and that we needed them to
regenerate the access token, we tried modifying our
config files and found a fix that worked without having to patch
Google’s gems.</p>
<h2 id="the-fix">The fix:</h2>
<p>Original adwords_api.yml, as generated from <code class="language-plaintext highlighter-rouge">setup_oauth2.rb</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="s">:authentication:</span>
  <span class="s">:method: OAuth2</span>
  <span class="s">:oauth2_client_id: REDACTED</span>
  <span class="s">:oauth2_client_secret: REDACTED</span>
  <span class="s">:developer_token: REDACTED</span>
  <span class="s">:client_customer_id: REDACTED</span>
  <span class="s">:user_agent: WebKite_Radius</span>
  <span class="s">:oauth2_token:</span>
    <span class="s">:access_token: REDACTED</span>
    <span class="s">:refresh_token: REDACTED</span>
    <span class="s">:issued_at: 2018-08-23 13:10:13.299601000 -04:00</span>
    <span class="s">:expires_in: </span><span class="m">3600</span>
    <span class="s">:id_token:</span>
<span class="s">:service:</span>
  <span class="s">:environment: PRODUCTION</span>
<span class="s">:connection:</span>
  <span class="s">:enable_gzip: </span><span class="no">false</span>
<span class="s">:library:</span>
  <span class="s">:log_level: INFO</span>
</code></pre></div></div>
<p>Note the pertinent information is the <code class="language-plaintext highlighter-rouge">:issued_at:</code> and <code class="language-plaintext highlighter-rouge">:expires_in:</code> values.
The fix is to switch <code class="language-plaintext highlighter-rouge">:expires_in:</code> to 0:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="s">:authentication:</span>
  <span class="s">:method: OAuth2</span>
  <span class="s">:oauth2_client_id: REDACTED</span>
  <span class="s">:oauth2_client_secret: REDACTED</span>
  <span class="s">:developer_token: REDACTED</span>
  <span class="s">:client_customer_id: REDACTED</span>
  <span class="s">:user_agent: WebKite_Radius</span>
  <span class="s">:oauth2_token:</span>
    <span class="s">:access_token: REDACTED</span>
    <span class="s">:refresh_token: REDACTED</span>
    <span class="s">:issued_at: 2018-08-23 13:10:13.299601000 -04:00</span>
    <span class="s">:expires_in: </span><span class="m">0</span>
    <span class="s">:id_token:</span>
<span class="s">:service:</span>
  <span class="s">:environment: PRODUCTION</span>
<span class="s">:connection:</span>
  <span class="s">:enable_gzip: </span><span class="no">false</span>
<span class="s">:library:</span>
  <span class="s">:log_level: INFO</span>
</code></pre></div></div>
<p>After that, things worked perfectly! The gems assumed the current
access_token was already expired since it had a lifetime of 0, and that
forced generation of an up-to-date access token.</p>
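<p>To see why a lifetime of 0 does the trick, here is a simplified Python model of the check. It assumes a token counts as expired once <code class="language-plaintext highlighter-rouge">issued_at</code> plus <code class="language-plaintext highlighter-rouge">expires_in</code> seconds has passed, and it mirrors the force-refresh-or-expired condition from <code class="language-plaintext highlighter-rouge">get_token</code>. The function names are hypothetical, and this is a sketch of the idea, not the code in Google’s gems.</p>

```python
from datetime import datetime, timedelta


def token_expired(issued_at, expires_in, now):
    """Return True once now has reached issued_at + expires_in seconds."""
    return now >= issued_at + timedelta(seconds=expires_in)


def needs_refresh(issued_at, expires_in, now, force_refresh=False):
    """Refresh when forced, or when the stored token looks expired."""
    return force_refresh or token_expired(issued_at, expires_in, now)


issued = datetime(2018, 8, 23, 13, 10, 13)
ten_minutes_later = issued + timedelta(minutes=10)

# With the generated config the token looks fresh, so no refresh is requested.
print(needs_refresh(issued, 3600, ten_minutes_later))  # False

# With a lifetime of 0 the token always looks expired, so a refresh is requested.
print(needs_refresh(issued, 0, ten_minutes_later))  # True
```

<p>In this model, any recorded lifetime that is longer than the real one delays the refresh past the point where the token stops working, which matches the hourly failures we were seeing.</p>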
<h2 id="let-me-know-if-this-helped-you-out">Let me know if this helped you out.</h2>
<p>This was a really unusual bug for us, and we were surprised that it
wasn’t all over StackOverflow or the Google forums since it completely
took our production services down and was extremely difficult to
resolve.</p>
<p>I’m still curious if we just have something really odd going on in our
local setup, or if this was a more widespread problem. So please shoot
me an email if you found this information helpful, or if you can shed
any additional light on the sudden change in our production
environment.</p>
<p>Thanks!</p>
<p>-Grant</p>
Universal History of Bitcoin Infamy2018-04-06T00:00:00+00:00http://www.grant-olson.net/news/2018/04/06/universal-history-of-bitcoin-infamy<h2 id="presented-at-the-crypto-for-the-community-conference-pittsburgh">Presented at the Crypto For The Community Conference, Pittsburgh,</h2>
<p>April 2018</p>
<p><a href="/uhobi/">Slides for presentation.</a></p>
Help! I fried my postgres install on homebrew!2017-05-08T00:00:00+00:00http://www.grant-olson.net/news/2017/05/08/i-fried-my-homebrew-postgres<p>Did you:</p>
<ul>
<li>Get an error about readline when running psql?</li>
<li>Quickly do a <code class="language-plaintext highlighter-rouge">brew upgrade postgres</code>?</li>
<li>Think everything seemed fine until you rebooted your computer?</li>
<li>At that point learn that postgres wasn’t running because of incompatible file versions?</li>
<li>Enter a world of hurt when you started reading up on the fact that an upgrade from postgres 9.3 to postgres 9.4 required a manual DB upgrade?</li>
<li>Experience shock to learn that you couldn’t even install postgres 9.3 after brew installed postgres 9.6?</li>
<li>Curl up in a fetal position, covered in cold sweat, wondering how the hell you’re going to have time to rebuild your complicated, er… you mean sophisticated, development environment from scratch when there’s so much work to be done?</li>
</ul>
<p>If so, I feel your pain. Hopefully I can help.</p>
<h2 id="fix-to-migrate-your-old-postgres-93-databases-to-96-in-homebrew">Fix to migrate your old postgres 9.3 databases to 9.6 in homebrew.</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mv /usr/local/var/postgres/ /usr/local/var/postgres.old
# Get old versions
brew tap petere/postgresql
brew install postgresql@9.3
brew install postgresql@9.6
# Install 9.6 db
initdb /usr/local/var/postgres/
# Stop the running 9.6 instance
sudo brew services stop postgres
# Migrate to new version
# may need to look in to /usr/local/Cellar to get exact directories
pg_upgrade -b /usr/local/Cellar/postgresql@9.3/9.3.16/bin/ -B /usr/local/Cellar/postgresql/9.6.2/bin/ -d /usr/local/var/postgres.old/ -D /usr/local/var/postgres
# Drink a coffee (or beer) or three or six while there's a migration
# Start up the server
brew services start postgres
# Verify server is running
psql
</code></pre></div></div>
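<p>If you want to confirm that mismatched versions are really the problem, PostgreSQL writes a <code class="language-plaintext highlighter-rouge">PG_VERSION</code> file at the top of the data directory recording the major version that created it. Here is a small Python sketch that compares it against the server version you are about to run. The helper names are hypothetical, and the demo uses a temporary directory rather than a real data directory.</p>

```python
import os
import tempfile


def major_version(version):
    """Return the major version: '9.3.16' -> '9.3', '10.4' -> '10'."""
    parts = version.strip().split(".")
    if int(parts[0]) >= 10:
        return parts[0]  # from 10 onward the first number is the major version
    return ".".join(parts[:2])  # before 10 the first two numbers are


def data_dir_version(data_dir):
    """Read the PG_VERSION file that PostgreSQL keeps in its data directory."""
    with open(os.path.join(data_dir, "PG_VERSION")) as f:
        return f.read().strip()


def needs_pg_upgrade(data_dir, server_version):
    """True when the data directory was written by a different major version."""
    return major_version(data_dir_version(data_dir)) != major_version(server_version)


# Demo with a throwaway directory standing in for /usr/local/var/postgres.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "PG_VERSION"), "w") as f:
        f.write("9.3\n")
    print(needs_pg_upgrade(demo_dir, "9.6.2"))   # True
    print(needs_pg_upgrade(demo_dir, "9.3.16"))  # False
```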
<h2 id="but-now-it-blows-up-because-of-postgis">But now it blows up because of postgis!</h2>
<p>I had two old databases that used PostGIS. They caused the migration to fail. Attempts to get postgis to install on the 9.3 version of postgres failed. Unfortunately I don’t have a fix for that, but can tell you how to at least delete the offending databases if you don’t care about them, like I didn’t. If you do need the postgis enabled databases, the documentation for the tap indicates that you can use a utility called pex to install things, but I didn’t bother figuring that out.</p>
<p>To delete the old offending databases:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew unlink postgresql@9.6
brew link -f postgresql@9.3
# Need to start manually because the files aren't where we expect
pg_ctl -D /usr/local/var/postgres.old/ -l /usr/local/var/postgres/server.log start
psql # Do DROP DATABASE etc
pg_ctl -D /usr/local/var/postgres.old/ stop
brew unlink postgresql@9.3
brew link -f postgresql@9.6
</code></pre></div></div>
<h2 id="a-little-annoyed-at-brew-right-now">A little annoyed at brew right now.</h2>
<p>Brew has given me more than enough that I can’t stay mad at it for too long, but I’m a little disappointed that:</p>
<ol>
<li>
<p>It silently upgraded readline which introduced a bunch of errors to old versions of software. I actually thought this blew up when I did an OS upgrade to OSX.</p>
</li>
<li>
<p>It makes no attempt to warn me that upgrading from 9.3 requires some serious manual intervention, the second time it’s silently updated a version of software to a version that’s incompatible with everything I have installed.</p>
</li>
<li>
<p>It allows no way out-of-the-box at this point for me to install the 9.3 binaries to do the upgrade from 9.3 to 9.4.</p>
</li>
<li>
<p>It seems to have broken old ways of installing old software by checking out an old commit of a particular brew file. Even though I tracked down the commit for 9.3, we now seem to autoupgrade and always install 9.6.2.</p>
</li>
<li>
<p>It has inexplicably (to me) deprecated vast swaths of homebrew commands. I’m sure the developers have their reasons and if I was on top of things it would make sense, but it’s frustrating to find four alternate solutions on StackOverflow that should magically fix your problem only to be told politely by brew, “Sorry, that command just doesn’t work anymore. Try again!”</p>
</li>
</ol>
<p>It would be nice if there was some sort of <code class="language-plaintext highlighter-rouge">--force</code> or <code class="language-plaintext highlighter-rouge">Are you sure?(Y/N)</code> prompt for these more disruptive upgrades, and I wish the 9.3 version of postgres had floated around a bit longer so I could have fixed the problem without resorting to third-party taps.</p>
<h2 id="and-annoyed-at-myself">And annoyed at myself.</h2>
<ol>
<li>
<p>For not properly investigating the broken readline stuff I’d been dealing with off and on for a bit and ‘fixed’ by rebuilding my rubies in rvm.</p>
</li>
<li>
<p>For running brew commands willy-nilly.</p>
</li>
<li>
<p>For not having a setup where it wasn’t a problem to blow away my dev dbs and start from scratch. I should have either had backups, or been able to work from clean databases without affecting my productivity.</p>
</li>
</ol>
<h2 id="and-thankful-to-peter-eisentraut">And thankful to <a href="http://peter.eisentraut.org/">Peter Eisentraut</a></h2>
<p>Whose <a href="https://github.com/petere">homebrew tap</a> saved the day. Thanks Peter!</p>
<h2 id="and-four-hours-day-later-on-a-monday-afternoon">And four hours later on a Monday afternoon</h2>
<p>That twelve character bug fix worked! I’m off to my next coding adventure. Maybe now would be a good time to upgrade to Sierra. What’s the worst that could happen?</p>
<h2 id="update-2018-06-05---brew-does-it-again">Update 2018-06-05 - Brew does it again!</h2>
<p>I just tried installing pg_top, a utility that lets you view active
connections to your database. It should be a simple tool, but brew
decides to:</p>
<ol>
<li>
<p>Upgrade from 9.6.2 to 10.4 without saying anything!</p>
</li>
<li>
<p>Restart the service instead of leaving the old background one in
place!</p>
</li>
<li>
<p>Make no attempt to migrate existing databases to the new version!</p>
</li>
</ol>
<p>Fortunately, this time brew was at least kind enough to keep the old
version around. I was able to fix it with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew switch postgres 9.6.2
brew services stop postgres
brew services start postgres
</code></pre></div></div>
<p>Brew, please quit auto-upgrading services in a way that leaves things
in an inconsistent state. If I need to manually migrate between major
postgres versions, you shouldn’t just automatically update when I’m
installing a small utility that has postgres as a dependency.</p>
<h3 id="update-2019-01-09">Update 2019-01-09</h3>
<p>I just built some new OSX systems from scratch and had the opportunity
to think about swapping brew out for MacPorts or other
solutions. I’ve been happy with brew 99.9% of the time, but still
occasionally encounter problems with background servers. I continued
to stick with brew for all my libraries, but used
<a href="https://postgresapp.com/">Postgres.app</a> for my database install. It
integrates extremely well with brew and the rest of my tool chain, and
gives me control over when and where to upgrade.</p>
Certificate Chains, Amazon EC2, and You!2015-03-19T00:00:00+00:00http://www.grant-olson.net/news/2015/03/19/certificate_chains_amazon_ec2_and_you<p>Are you getting https sec_error_unknown_issuer Error in Firefox? Did
you add https at your EC2 Load Balancer? Well then Amazon lied to you.</p>
<p>We just dealt with a really frustrating error over at WebKite. Our
site was suddenly broken with a bad ssl certificate, but only on
Firefox. To make matters more confusing, things worked fine on all our
Firefox installs in the office, but only blew up on clean installs.</p>
<p>If you’re reading this, and you’re seeing a <code class="language-plaintext highlighter-rouge">(Error code:
sec_error_unknown_issuer)</code> in Firefox, and you’re hosting your site on
EC2, then I can hopefully help you out.</p>
<p><a href="/files/Firefox_error.png">
<img src="/files/Firefox_error.png" width="600" />
</a></p>
<h2 id="tldr-amazon-lied-to-you-when-they-said-the-certificate-chain-was-optional">TLDR: Amazon Lied to You When They Said the Certificate Chain Was Optional</h2>
<p>Well I don’t know if I’d say they lied exactly, but at the time we set
up the new certificate in the EC2 dashboard, Amazon showed us this:</p>
<p><a href="/files/aws_console.png">
<img src="/files/aws_console.png" width="600" />
</a></p>
<p>We didn’t add the allegedly optional certificate chain. On some
installs of Firefox, we now get the above error. Creating a new
certificate for the load balancer that included a valid certificate
chain fixed the problem.</p>
<h2 id="what-is-the-certificate-chain">What is the certificate chain?</h2>
<p>I’ve said it before and I’ll say it again: <em>Crypto is easy,
authentication is hard.</em></p>
<p>It’s easy enough to encrypt browser connections so that nobody can
read the traffic going on between you and a server when you, for
example, send them your credit card number or social security number
or mother’s maiden name. But you need some method of authenticating the
encryption keys so you know that they belong to the server you’re
talking to, and not a hacker or government agency trying to snoop on
you. That is: You need to trust the encryption keys you’re using to be
able to trust the safety of your encrypted communication, by
authenticating them as valid and from a trusted source. If you can’t
do that, the green lock on your browser means nothing.</p>
<p>https solves this problem with <strong>Certificate Authorities</strong>. These are
authorities who are trusted to <em>vouch</em> for other certificates by
signing them. A browser or other consumer of https decides which
authorities it trusts. This initial trust is written in
stone. Although some certifications are performed, from the
perspective of you the user sitting on your computer, these CAs are
trusted because Firefox says they’re trusted, and that’s all that you
need to know. In that sense that really makes your browser the
ultimate authority, and then it delegates that authority to the
various <strong>root certificates</strong> of Certificate Authorities that it trusts.</p>
<p>These root certificates then vouch for other sub-authorities. There
are several reasons to do this, but as a security concern it allows
you to compartmentalize damage if part of the system is
compromised. That brings us to the real-world analogy I like to use to
explain the system of trust I’ve already hinted at by calling it
<em>vouching</em>:</p>
<p><strong>Organized crime</strong>. A mob boss has lieutenants who work for him. These
lieutenants have their crews. These crews might have people working
for them. Each step along the way, introductions are made by vouching
for someone. You might tell your boss, “I know a guy. He’s a good
guy. We can trust him.” and your buddy joins the crew. The mob
boss doesn’t need to know about your buddy at all, but if the system
has integrity, starting with the mob boss and working the way down,
you’ve established a chain of trust that leads all the way down to you
from an undisputed authoritative source. Now if the system doesn’t
have integrity, and your boss is a rat, you all get killed, but the
rest of the system and trust is still in place. And the big boss
doesn’t need to know about the intimate details of things going on 3
levels underneath him to maintain the integrity of the system.</p>
<p>So back to the browser: it has several (actually hundreds of)
lieutenants who vouch for encryption keys. Someone else has vouched
for your encryption key. You’re just not important enough to get to
meet the big bad CAs themselves. An indeterminate number of layers
between the root certificate and your certificate create a chain of
trust that can be followed all the way from the big boss to little old
you. This is the <strong>certificate chain</strong>.</p>
<h2 id="why-didnt-this-break-on-company-computers">Why didn’t this break on company computers?</h2>
<p>Without that chain of trust, things should have blown up
everywhere. But they didn’t. They were working just fine on our
machines. This was particularly annoying because I use Firefox every
day, but it wasn’t broken on my machine. If it had been a Chrome or
Safari issue another coworker would have caught the error as well. But
we were all working blissfully unaware of the problem until I fired up
my backup laptop to deploy a quick hotfix at home.</p>
<p>This bothered us enough that we decided to reproduce the error in
staging. We:</p>
<ol>
<li>Moved back to the old certificate.</li>
<li>Deleted the existing certificate store per <a href="https://support.mozilla.org/en-US/kb/couldnt-initialize-applications-security-component?redirectlocale=en-US&redirectslug=Could+not+initialize+the+browser+security+component#w_corrupted-file">this article</a>.</li>
</ol>
<p>Boom! Things were broken on previously working machines. So the site
would work for people using Firefox, but only if they had already
accessed the site before we updated our
certificate. But that’s still ~11% of web users. Ugh!</p>
<p>I originally thought that a new version of Firefox had locked down SSL
security settings. But now I don’t believe that. I was making things
more complicated than they were. My current unproven working theory:</p>
<ol>
<li>Last year’s certificates on EC2 had the proper certificate chain.</li>
<li>When we accessed the site in the past, the certificate chain was
stored in Firefox.</li>
<li>When we requested new certificates from the same provider, it had
the same chain of signing certificates.</li>
<li>Our installations of Firefox were able to perform validation
because they had access to the pre-existing signing certificates in
the certificate db.</li>
<li>Installations of Firefox that had never visited the site before did
not, and complained.</li>
</ol>
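<p>The theory above can be sketched with Ruby’s bundled <code class="language-plaintext highlighter-rouge">openssl</code> library. This is only a model under stated assumptions: the three certificates are throwaway ones generated on the spot, <code class="language-plaintext highlighter-rouge">make_cert</code>, <code class="language-plaintext highlighter-rouge">trust_store_with</code> and <code class="language-plaintext highlighter-rouge">chain_demo</code> are names invented for the example, and a plain <code class="language-plaintext highlighter-rouge">OpenSSL::X509::Store</code> stands in for Firefox’s certificate db. It says nothing about how Firefox is actually implemented.</p>

```ruby
require 'openssl'

# Build a throwaway certificate. Every name here is made up for the demo.
def make_cert(common_name, key, issuer_cert: nil, issuer_key: nil, ca: false)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2 # X.509 v3
  cert.serial = rand(1..2**31)
  cert.subject = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert.issuer = issuer_cert ? issuer_cert.subject : cert.subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600
  if ca
    factory = OpenSSL::X509::ExtensionFactory.new
    factory.subject_certificate = cert
    factory.issuer_certificate = issuer_cert || cert
    cert.add_extension(factory.create_extension('basicConstraints', 'CA:TRUE', true))
  end
  # Self-signed when no issuer key is given (the root case).
  cert.sign(issuer_key || key, OpenSSL::Digest.new('SHA256'))
  cert
end

# A trust store that starts out knowing only the given certificates.
def trust_store_with(*certs)
  store = OpenSSL::X509::Store.new
  certs.each { |cert| store.add_cert(cert) }
  store
end

def chain_demo
  root_key = OpenSSL::PKey::RSA.new(2048)
  mid_key  = OpenSSL::PKey::RSA.new(2048)
  site_key = OpenSSL::PKey::RSA.new(2048)

  root = make_cert('Demo Root CA', root_key, ca: true)
  intermediate = make_cert('Demo Intermediate CA', mid_key,
                           issuer_cert: root, issuer_key: root_key, ca: true)
  site = make_cert('www.example.test', site_key,
                   issuer_cert: intermediate, issuer_key: mid_key)

  {
    # New install, server sends only its own certificate.
    fresh_without_chain: trust_store_with(root).verify(site),
    # New install, server also sends the intermediate.
    fresh_with_chain: trust_store_with(root).verify(site, [intermediate]),
    # Install that already remembers the intermediate from an earlier visit.
    returning_without_chain: trust_store_with(root, intermediate).verify(site)
  }
end

chain_demo.each { |scenario, ok| puts "#{scenario}: #{ok}" }
```

<p>Under those assumptions only the first scenario fails: the store that has never seen the intermediate cannot link the site certificate to the root, while the store that remembers it validates the very same bare certificate.</p>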
<p>Back to the organized crime example: Someone tells you, “Hey this is
my buddy Vladimir. You guys should talk. I think you could do some
good for each other.” You reply, “Vladimir? We go way back. Remember
that thing? No the other thing. Yeah, yeah, good times!” There’s no
need to re-establish trust for someone you’ve already decided is
trustworthy.</p>
<h2 id="some-warning-signs-in-retrospect">Some warning signs in retrospect</h2>
<p>Here are some things that should have tipped us off to the problem in
advance. They seem like obvious warning signs after the fact, but you
know what they say about hindsight.</p>
<ol>
<li>
<p>The directions to get <a href="http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/SecureConnections.html#CNAMEsAndHTTPS">SSL set up on Cloudfront</a> for our static
assets in s3 provided a sequence of instructions that required a
certificate chain. It wouldn’t let us treat that component as optional.</p>
<p><strong>ProTip™:</strong> This provided us with a really quick fix on our end, as
the certificates we built for that were now in the drop-downs for
the EC2 load balancers, so we didn’t need to figure out how to
rebuild new certificates. If you did the same setup for Cloudfront
with the same certificate, just use that one in EC2.</p>
<p>In addition, it seems our static assets loaded just fine in
versions of Firefox where the main site was breaking.</p>
</li>
<li>
<p>Someone experienced problems accessing the site on a phone. For
some reason the errors read like an expired certificate, so we
developed elaborate hypotheses about their cell phone network
caching old pages to save bandwidth. A bad certificate is a more
reasonable explanation.</p>
</li>
<li>
<p>We needed to manually install the certificates on our linux boxes
so our RPC calls wouldn’t fail. Requests between our back-end
services started failing after we got new certificates. We
identified this on staging and figured out how to manually install
the certificates, and then we no longer got ssl errors as the
boxes talked to each other.</p>
<p>The fact that reasonably up to date machines didn’t have the
certificates installed should have made us think more about why
they weren’t part of the default certificate store on Ubuntu. But
we wrote that off as openssl being really flaky and fragile when
you’re doing stuff from the command line, so we just added the
certs to prod boxes and went along our merry way.</p>
<p>In reality this was the same problem as described in the section
above. If we had included the full certificate chain with our
certificate, then our RPC calls would have been able to provide
full authentication up to the root certificate that was already in
the system’s certificate store. But in this case the system won’t
cache intermediate certificates as it gets them, because you need
root access to store them in <code class="language-plaintext highlighter-rouge">/etc</code> and to run the
<code class="language-plaintext highlighter-rouge">update-ca-certificates</code> command that generates the system’s master
list.</p>
</li>
</ol>
<h2 id="why-didnt-the-other-browsers-complain">Why didn’t the other browsers complain?</h2>
<p>Good question. Firefox has been getting much more strict on these
sorts of validations and the process of vouching. In fact, not only
did Chrome refuse to complain here, it refused to complain when our
ssl certificates on staging expired unexpectedly! That’s right. The
certificates were expired and invalid and Chrome kept loading the site
without complaining.</p>
<p>As some of my previous blog posts and work have explained, I’m not a
huge proponent of the Certificate Authority model, but if you’re going
to do it you should do it right. If not, you might as well start
trusting self-signed certificates in the browser. A certificate is
only valid if you can validate all the certifications all the way up
to one that you trust, including things like expiration date and the
general authenticity of each signing certificate.</p>
<p>Back to the (growing tired and old) analogy, a stranger walks up to
you on the street and tells you his friend Eve is a great
safecracker. Why would you trust Eve? You wouldn’t. (Unknown chain.)
Or what about the same from your friend Pedro you haven’t seen in 15
years? Even though he’s never done you wrong, some time might cause
your trust to expire. (Expired signing certificate.)</p>
<h2 id="hope-that-helped">Hope that helped</h2>
<p>I’m a bit curious if our problem was unique, or if a lot of sites are
blowing up because they weren’t configured in a way that worked with
the newest versions of Firefox. If you encountered this problem, or a
similar one that wasn’t directly related to EC2, please shoot me an
email.</p>
<p>Grant</p>
<h2 id="addendum">Addendum</h2>
<p>A co-worker found <a href="http://www.nczonline.net/blog/2012/08/15/setting-up-ssl-on-an-amazon-elastic-load-balancer/">this article from 2012</a>, or almost 2 and 1/2 years
ago!</p>
<p>Pertinent lines:</p>
<blockquote>
<p>Don’t be fooled by the AWS dialog, the certificate chain isn’t really
optional when your ELB is talking directly to a browser. The
certificate chain is the part that verifies that fully verifies which
certificate authority issued the certificate and therefore whether or
not the browser can trust that the domain certificate is
valid. Different browsers handle things in different ways, but if you
are missing the certificate chain and firefox, you get a pretty scary
warning page.</p>
</blockquote>
<p>Doh!</p>
Reason 938 to Make Sure Your Test Fails Before It Passes2014-05-13T00:00:00+00:00http://www.grant-olson.net/news/2014/05/13/reason_938_to_make_sure_your_test_fails_before_it_passes<p>Here’s a quick example showing why you want to see your test fail
before you see it pass. This verifies that you’re actually testing
what you think you’re testing. This rspec test was passing just fine
before I realised I didn’t even test to see if the result was true:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>it "detects normal zip" do
Geomancer.zip_code_only?("15217").should
end
</code></pre></div></div>
<p>I only noticed it when I wrote the next test which also passed when
it should have failed.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>it "doesn't detect bad zip" do
Geomancer.zip_code_only?("123456").should
end
</code></pre></div></div>
<p>(<code class="language-plaintext highlighter-rouge">x.should</code> is perhaps even more enigmatic than <code class="language-plaintext highlighter-rouge">x.should be</code>, which is
actually valid and useful rspec syntax.)</p>
<p>I was honestly a little surprised that this didn’t fail with some sort
of runtime error, but rspec works in mysterious ways. I still haven’t
decided if this is a feature or a bug, but I think it would probably
be nice if this threw a runtime error. I can’t think of a case where
the above syntax would be useful.</p>
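<p>As far as I can tell, rspec of that era treated a bare <code class="language-plaintext highlighter-rouge">should</code> as the first half of an operator expression such as <code class="language-plaintext highlighter-rouge">x.should == y</code>, which would explain why nothing failed. The sketch below is a toy model of that behaviour in plain Ruby, not rspec’s real code: <code class="language-plaintext highlighter-rouge">ToyOperatorProxy</code>, <code class="language-plaintext highlighter-rouge">toy_should</code>, <code class="language-plaintext highlighter-rouge">raises?</code> and the stand-in <code class="language-plaintext highlighter-rouge">zip_code_only?</code> are all invented for illustration.</p>

```ruby
# Toy model only: these names are invented, they are not rspec classes.
class ToyOperatorProxy
  def initialize(actual)
    @actual = actual
  end

  # The comparison, and any failure, only happens if == is actually called.
  def ==(expected)
    raise "expected #{expected.inspect}, got #{@actual.inspect}" unless @actual == expected
    true
  end
end

# With a matcher: check immediately. Without one: hand back a proxy and
# check nothing yet.
def toy_should(actual, matcher = nil)
  return ToyOperatorProxy.new(actual) if matcher.nil?
  raise "expectation failed for #{actual.inspect}" unless matcher.call(actual)
  true
end

# Hypothetical stand-in for Geomancer.zip_code_only?
def zip_code_only?(text)
  text.match?(/\A\d{5}\z/)
end

# Small helper: did the block raise?
def raises?
  yield
  false
rescue RuntimeError
  true
end

be_true = ->(value) { value == true }

# No matcher, so nothing is checked, even for a bad zip.
puts raises? { toy_should(zip_code_only?("123456")) }
# With a matcher the bad zip is caught.
puts raises? { toy_should(zip_code_only?("123456"), be_true) }
# A good zip passes once something is actually asserted.
puts raises? { toy_should(zip_code_only?("15217"), be_true) }
```

<p>In the toy version, as in my broken tests, the expectation only means something once a matcher or an operator is attached to it.</p>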
Did Julius Caesar Predict the World Would End in 3268 AD?2014-04-28T00:00:00+00:00http://www.grant-olson.net/news/2014/04/28/did-julius-caeser-predict-the-end-of-the-world<p>One of the nice things about dynamic languages like ruby is the
REPL. The Read-Evaluate-Print Loop. Also known as the interactive
console. In ruby you fire it up with <code class="language-plaintext highlighter-rouge">irb</code>. Sometimes it’s easier to
fire this up to learn about the implementation than to actually <em>ugh</em> read
the documentation.</p>
<p>I was messing around with dates, and wanted to get an idea of how
dates were formatted:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grant@john-icicleboy:~$ irb
2.1.1 :001 > require 'date'
=> true
2.1.1 :002 > puts Date.new
-4712-01-01
=> nil
</code></pre></div></div>
<p>I was really surprised to see that the date given with no arguments
provided was 4712 BC. Now I suspected that <code class="language-plaintext highlighter-rouge">Date.new</code> was really
shorthand for <code class="language-plaintext highlighter-rouge">Date.new(0)</code> and that this value was actually the epoch
for ruby’s Date class, similar to the way Unix uses an epoch of
1970-Jan-01 and stores dates as the number of seconds relative to
this. (2 equals 2 seconds after Jan 1, 1970, etc.)</p>
<p>But why does ruby choose year -4712? That seems suspiciously as if ruby
assumes the world is only 6 or 7 thousand years old! Instead of using
this to troll people about creationism on twitter, I decided to dig in
and
<a href="http://www.ruby-doc.org/stdlib-2.1.1/libdoc/date/rdoc/Date.html#method-c-new">RTFM</a>.
This does indicate that this year was intentionally and specifically
chosen, and talks about various calendar systems throughout the ages,
but isn’t useful in answering the question at hand. What is so
important about -4712?</p>
<p>For this we have to turn to wikipedia. The article on the Gregorian
Calendar isn’t particularly useful. Neither is the one on the Julian
Calendar. But I add 4712 to my google searches, and I finally get to
the page I’m looking for. It’s about the <a href="https://en.wikipedia.org/wiki/Julian_day">Julian
Day</a>. It explains that the
Julian Day Number 0 is assigned to 4713 BC. It also goes on to
explain that the Julian Period has an interval of 7980 years.</p>
<p>The first thing an attentive reader will notice is that the Julian
Period begins in 4713 BC, and I’ve been spouting off about 4712 BC.
How could the ruby implementation know about all these details and
then get the year off by one? It didn’t. So why is it different?
Because there’s no year zero in the calendar system. We go from 1 BC
to 1 AD. However, the Date class does accept the number zero as a
year, and it ends up representing 1 BC. So every BC year is shifted by
one in this numbering, and year -4712 is 4713 BC.</p>
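<p>Here is a quick check of that off-by-one, assuming the stock <code class="language-plaintext highlighter-rouge">date</code> library. The <code class="language-plaintext highlighter-rouge">historical_year</code> helper is made up for this sketch; it just converts a year number that includes a year zero into the BC/AD label that doesn’t.</p>

```ruby
require 'date'

# Convert a year number that has a year 0 (what Date uses) into a
# historical label, where 1 BC is followed directly by 1 AD.
def historical_year(year_number)
  year_number > 0 ? "#{year_number} AD" : "#{1 - year_number} BC"
end

default_date = Date.new                      # no arguments at all

puts default_date.jd                         # Julian Day Number of the default date
puts Date.jd(0)                              # the date that has Julian Day Number 0
puts historical_year(default_date.year)      # label for year -4712
puts historical_year(0)                      # label for year 0
puts historical_year(default_date.year + 7980) # one full Julian Period later
```

<p>The default date and Julian Day Number 0 are the same day, year -4712 comes out as 4713 BC, year 0 as 1 BC, and adding the 7980 year period lands on 3268 AD.</p>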
<p>Remember all the hype about the Mayans predicting the end of
the world on December 21st, 2012? It was the same scenario. This
date was actually the date when the 5,126 year long calendar looped
around and started over. It wasn’t considered the end of time any
more than the end of one year and the start of the next. And yet a
bunch of people were still saying the Mayans thought the world was
going to end!</p>
<p>I think it’s interesting that the Julian Period also has an end. The
period ends in 3268 AD. That’s over a millennium away from today’s
date. By then we could be using some new calendar system. Star
Date. Metric Time. Who knows? Today’s religions could seem ancient
and silly. The Roman Empire itself could seem as distant culturally
as the Mayan Empire does to us now. Will someone stumble upon these
old articles about the Julian Period? Will they interpret the end of
the cycle as the end of the world? Will the headlines read:</p>
<blockquote>
<p>Julius Caesar Predicted It! The End Is Near!</p>
</blockquote>
<p>?</p>
<p>We shall see.</p>
Upstart Configuration for God2014-03-20T00:00:00+00:00http://www.grant-olson.net/news/2014/03/20/upstart_config_for_god<p>I thought I’d follow up my <a href="/news/2014/03/16/process_management_virtualization_religion_and_god.html">completely impractical post on
god</a>
with a practical one. I needed to write an upstart script for god and couldn’t find any examples out there. Here’s what I ended up doing.</p>
<h2 id="full-script">Full Script</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/init/god.conf
start on runlevel [2345]
stop on runlevel [06]
setuid webkite
setgid webkite
respawn
respawn limit 10 60
env HOME=/home/webkite
exec bash -l -c 'cd /opt/node/apps/god && exec bundle exec god -c my.god.rb -l /opt/node/log/god.log -P /opt/node/pids/god.pid -D'
</code></pre></div></div>
<h2 id="the-breakdown">The Breakdown</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>start on runlevel [2345]
stop on runlevel [06]
</code></pre></div></div>
<p>Start in the normal multi-user runlevels (2 through 5), and stop on halt (0) or reboot (6).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>setuid webkite
setgid webkite
</code></pre></div></div>
<p>Run as an unprivileged user. Don’t run as root.</p>
<p>On the negative side: We can’t take advantage of the event driven
conditions in god, such as ‘kill process if memory exceeds a half a
gig’.</p>
<p>On the positive side: We don’t run as root. We don’t need a system rvm.
And we don’t need to run rvm as root.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>respawn
respawn limit 10 60
</code></pre></div></div>
<p>Have upstart respawn the process if it dies unexpectedly, but don’t
let it go into death throes and overwhelm the server if it’s just
plain broken.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env HOME=/home/webkite
</code></pre></div></div>
<p>We end up running a bash login shell to load rvm functions, but even
that assumes that you have a decent <code class="language-plaintext highlighter-rouge">$HOME</code> variable. We don’t
without this.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>exec bash -l -c 'cd /opt/node/apps/god && exec bundle exec god -c my.god.rb -l /opt/node/log/god.log -P /opt/node/pids/god.pid -D'
</code></pre></div></div>
<p>Actually start god. This was the really tricky part.</p>
<ol>
<li>
<p>We need <code class="language-plaintext highlighter-rouge">bash -l -c</code> so rvm works. <code class="language-plaintext highlighter-rouge">rvm use</code> won’t work in sh.</p>
</li>
<li>
<p>Upstart uses magic to track the process id. If you fork or
daemonize, this changes. Upstart provides the two options <code class="language-plaintext highlighter-rouge">expect
fork</code> and <code class="language-plaintext highlighter-rouge">expect daemon</code> which work in most cases. Or so I’m
told. But we still lost the proper process id with god for
unknown reasons. So we needed to:</p>
<ul>
<li>
<p>Use <code class="language-plaintext highlighter-rouge">exec</code> so bash doesn’t start its own process.</p>
</li>
<li>
<p>Specify <code class="language-plaintext highlighter-rouge">-D</code> (no-daemonize) even though we are daemonized,
so that god doesn’t fork on its own and upstart gets the
correct process id.</p>
</li>
</ul>
</li>
</ol>
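<p>The reason <code class="language-plaintext highlighter-rouge">exec</code> helps can be shown with a small Ruby sketch: a program started with exec takes over the existing process instead of getting a new one, so the process id the parent was handed stays valid. This assumes a Unix-like system where <code class="language-plaintext highlighter-rouge">fork</code> is available, and <code class="language-plaintext highlighter-rouge">exec_keeps_pid?</code> is a name invented for the example. It illustrates the general mechanism, not upstart’s internals.</p>

```ruby
require 'rbconfig'

# Fork a child, have it exec a fresh ruby, and compare the PID the parent
# was given with the PID the exec'd program reports about itself.
def exec_keeps_pid?
  reader, writer = IO.pipe
  child_pid = fork do
    reader.close
    # exec replaces the child's program; it does not create a new process.
    exec(RbConfig.ruby, '-e', 'print Process.pid', out: writer)
  end
  writer.close
  reported_pid = reader.read.to_i
  reader.close
  Process.wait(child_pid)
  child_pid == reported_pid
end

puts exec_keeps_pid?
```

<p>A program that forks or daemonizes itself breaks this equality, which is why the script above both uses <code class="language-plaintext highlighter-rouge">exec</code> and passes <code class="language-plaintext highlighter-rouge">-D</code>.</p>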
<p>Hope this helps someone.</p>
Process Management, Virtualization, Religion and God2014-03-16T00:00:00+00:00http://www.grant-olson.net/news/2014/03/16/process_management_virtualization_religion_and_god<p>Sometimes I think about things a little too much.</p>
<p>In this particular case, I was configuring <a href="http://godrb.com/">god</a> to
watch the components of our <a href="http://webkite.com/beta/">new software
stack</a> at WebKite. God is process
management software that lets you start, stop, and restart programs.
Most importantly it will automatically restart a dead process so I
don’t get paged at 3:17 am on a Sunday.</p>
<p>Software developers are known for coming up with overly clever names
for their creations, and god is no exception. It sits there watching
over the world, bringing its children to life, keeping a benevolent
eye on them, sometimes killing them dead in their tracks, and
sometimes resurrecting and healing them when misfortune arises.
Pretty clever, right?<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>
<p>I had god set up and working, but there was just one problem. God
cannot monitor and restart himself when he dies due to some unforeseen
misfortune. For example, a server reboot. For this, I needed to
configure <a href="http://upstart.ubuntu.com/">upstart</a> since we’re running
<a href="http://upstart.ubuntu.com/">ubuntu</a>. This turned out to be
surprisingly time consuming (ever use rvm and bundler on a server?),
but like a lot of dev ops stuff it wasn’t particularly intellectually
challenging. Tweak one setting, reboot the server, see if it works.
Repeat 50 times.</p>
<p>And this is where my mind started to wander. God only thinks he’s all
powerful and almighty. But he’s not. He’s just another program.
Maybe a little more powerful than most, but still just a userland program.
He’s not the One True Transcendent God, creator of all, timeless,
formless, boundless. He’s the <a href="https://en.wikipedia.org/wiki/Demiurge#Yaldabaoth">demiurge</a>!</p>
<p>I imagine at this point most readers are unimpressed and asking, “What
the hell is the demiurge?” The demiurge is a deity in Gnostic
cosmology. Okay, that probably doesn’t help unless I explain who the
Gnostics were.</p>
<p>There were a wide variety of Christian sects which had drastically
different beliefs between Christ’s death and the time some 300 years
later that Christianity was established as the official religion of
the Roman Empire and the First Council of Nicaea established proper
Christian orthodoxy. They all wrote gospels to spread the Word as
they interpreted it. There were hundreds of Gospels. The ones we
settled on (Matthew, Mark, Luke, and John) and put in the Bible weren’t
written by anyone who knew Christ directly. They were written
somewhere between 40 and 150 years after Christ walked the earth.<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></p>
<p>There’s an open problem in Judeo-Christian belief systems. In simple
terms: Why do good things happen to bad people? More explicitly: If
the universe was created by a benevolent, loving, supreme and perfect
being, how can it possibly contain <strong>any</strong> imperfection?</p>
<p>The various Gnostic sects found an interesting solution to this
problem: It wasn’t! The universe was created by a flawed deity who
created the material world. He only thinks he’s the supreme being.
He was an aborted creature, left for dead, who managed to survive, and
being alone assumed he was the highest power in all of creation. He
then went on to create the physical universe which inherited his
flaws. This flawed deity is called the demiurge.</p>
<p>This provides an interesting explanation to the dichotomy of the
vengeful god that exists in the Old Testament (flooding the world,
destroying Sodom and Gomorrah, punishing and vindictive) and the
loving god that Christ preaches about. Christ is teaching us about
the real transcendent god who exists outside of this realm. He is
trying to teach us how to unlock the divine spark that lives inside
us, through <strong>gnosis</strong> or <strong>knowledge</strong>, so that we too can transcend
the cage that is the imperfect physical universe created by the
imperfect demiurge, and reunite the spark with the genuine supreme
being.</p>
<p>And this gets us back to the god running our EC2 server. He sits
there thinking he’s running the show, that he’s in charge, but in
reality he’s a prisoner within a virtual machine that exists within
one of thousands of physical servers that exist within a server room
within a world he’s completely unaware and oblivious of. He sits there
managing these lesser processes, making sure they are cared for, even
if they do contain bugs and other imperfections. Yet he can’t remove
the imperfections. He can’t fix them. He can only keep the
applications going.</p>
<p>If this process is the demiurge in this scenario, what is the real
supreme being? I don’t know. But what I do know is that I had a few
terminal windows open. I rebooted the servers not by running
<code class="language-plaintext highlighter-rouge">/sbin/shutdown</code>, but by rebooting them in Amazon’s AWS web console.
And lo and behold, the following message appeared on the terminals
before they shut themselves down:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Someone pressed Control-Alt-Delete.
System rebooting now!
</code></pre></div></div>
<p>What could reach into the virtual machine and push a
Control-Alt-Delete button on a keyboard that never existed? A more
powerful monitoring process? One that lives in a heavenly place known
only as <strong>the cloud</strong>? One that was able to reach into a server room,
into a physical server, into an imaginary virtual server and touch an
untouchable keyboard to initiate a system reboot? One that has no
earthly name?</p>
<p>The sad thing is that this supreme process probably thinks he’s
omnipotent as he smites the universe that god has been happily
monitoring with an unexpected reboot. At least until <a href="https://aws.amazon.com/message/67457/">lightning bolts
thrown down from the sky</a> smite
this seemingly supreme process as well.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Well, at least until you try to search for technical information and Richard Dawkins keeps showing up in your results. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
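<p>In fact, one interesting thing to note is that when the Gospels quote Jesus, they do so word for word. This leads some scholars to think there is a mythical Q source, a single document that the following gospels used as their source. So even after you account for the fact that you’re not reading them in their original language, the quotes from Jesus are a reflection of a reflection of what he might have said. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>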
</li>
</ol>
</div>
What Does Yahoo BOSS Really Think of Cleveland?2014-02-08T00:00:00+00:00http://www.grant-olson.net/news/2014/02/08/what-does-yahoo-boss-really-think-of-cleveland<p>We had an interesting problem at work. Any time a user did a
location-based search, we returned the same result set: Lakewood,
Ohio. It didn’t matter what the user searched by. “Beverly Hills
90210” returned Lakewood, Ohio. “221B Baker Street” returned
Lakewood, Ohio. Lakewood is part of the Cleveland Metropolitan Area,
and as such we’ll just refer to it as “Cleveland” for the purposes of
this essay.</p>
<p>My initial thought: Something is screwed up with caching. Somehow
someone did a search on Ohio, and the results somehow got stuck in one
of our many caching layers. Subsequent requests kept returning the
same cached result. We do have a couple of employees from Ohio, so it
wouldn’t be entirely outlandish for a Cleveland-area search to be the
first search performed after a caching bug was introduced.</p>
<p>After digging through the code path, it turned out that Yahoo BOSS, our
geolocation provider, was returning the same results no matter what
location we submitted. Per the documentation, you perform a search
by submitting a request formatted as either
<code class="language-plaintext highlighter-rouge">placefinder?location=<address></code> or <code class="language-plaintext highlighter-rouge">placefinder?q=<address></code>. Our
code was using the <code class="language-plaintext highlighter-rouge">location</code> parameter. I tried changing this to the
<code class="language-plaintext highlighter-rouge">q</code> parameter, and suddenly geolocation requests magically worked.</p>
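<p>For illustration, here is roughly what the two request shapes look like when built in Ruby. The <code class="language-plaintext highlighter-rouge">placefinder</code> path and the <code class="language-plaintext highlighter-rouge">q</code> and <code class="language-plaintext highlighter-rouge">location</code> parameter names come from the description above; the helper name <code class="language-plaintext highlighter-rouge">placefinder_request</code> is invented, the host and authentication are left out entirely, and the exact escaping the real service expected is an assumption.</p>

```ruby
require 'uri'

# Hypothetical helper: builds the relative request described above.
# Only the path and the parameter names come from the post; the method
# name and the default are made up for this sketch.
def placefinder_request(address, param: 'q')
  "placefinder?#{URI.encode_www_form(param => address)}"
end

puts placefinder_request('221B Baker Street')
puts placefinder_request('Beverly Hills 90210', param: 'location')
```

<p>Keeping the parameter name in one place like this would have turned the fix into a one-word change, which is about what it ended up being.</p>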
<p>I made a quick fix to the code, and did something you should never do:
I pushed out an untested release directly to production at 6 PM on a
Friday. Our test and staging environments don’t do real geolocation
since each request costs money. They simply extract the zip code and
look that up. It would take a while to get a test environment with
real geolocation set up, and changing the word <code class="language-plaintext highlighter-rouge">location</code> to <code class="language-plaintext highlighter-rouge">q</code> in the
bowels of some obscure code didn’t seem likely to bring down the
entire site. At worst, location searches would just still be broken.
This wouldn’t be any worse than constantly returning Cleveland.
What’s the worst that could happen? So I pushed the code.</p>
<p>And…</p>
<p>Of course…</p>
<p>It worked! Problem solved. I entered the weekend breathing a deep sigh
of relief. But then I found myself wondering: Why Cleveland? Why, of
all the places on Earth, would Yahoo BOSS use Cleveland as the default
location for all invalid queries? There are three options I can think
of:</p>
<ol>
<li>
<p>Yahoo BOSS considers Cleveland to be the center of the universe.
When you don’t provide an appropriate geolocate-able address, it
decides it only makes sense to return the center of the universe, the
most important place on earth.</p>
</li>
<li>
<p>Yahoo BOSS somehow knows that Cleveland is some sort of nexus point
between dimensions, a.k.a. the Hellmouth.</p>
</li>
<li>
<p>Yahoo BOSS, after stripping invalid parameters, is left with the
impossible job of geolocating <em>the null address</em>. When provided a set
of null inputs, it must return the one area on Earth that most ideally
represents pure and unending emptiness, the absence of anything and
everything, the state of total nothingness. And in its infinite
wisdom, it returns Cleveland.</p>
</li>
</ol>
<p>There are simply no other options. Being a Pittsburgher, I have my own
opinions on which of the three options is unlikely, which is possible,
and which is probable. But I’ll let you decide for yourself.</p>
<p>(As Yakov Smirnoff once said, “In every country, they make fun of
city. In U.S. you make fun of Cleveland. In Russia, we make fun of
Cleveland.”)</p>
Using Your IronKey on 64-bit Ubuntu 13.102014-02-01T00:00:00+00:00http://www.grant-olson.net/news/2014/02/01/using-your-ironkey-on-64-bit-ubuntu-13-10<p>The linux IronKey executable is 32 bit, so you can’t unlock the
IronKey on 64 bit linux. The traditional fix is to install
<code class="language-plaintext highlighter-rouge">ia32-libs</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get install ia32-libs
</code></pre></div></div>
<p>This no longer works. The internet tells us that this meta-package
was removed so that you don’t install a bunch of garbage, and you
should just install the specific packages needed for your app.</p>
<p>Unfortunately there’s no way to tell what packages are required. The
internet says you should just run <code class="language-plaintext highlighter-rouge">ldd filename</code>. This produces an
error on the ironkey executable.</p>
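<p>Since <code class="language-plaintext highlighter-rouge">ldd</code> was no help, one thing you can at least confirm without any 32-bit libraries installed is that the executable really is 32-bit. The sketch below reads the first five bytes of the file: the four byte ELF magic number, then the class byte (1 for 32-bit, 2 for 64-bit). The <code class="language-plaintext highlighter-rouge">elf_class</code> name is made up for this example, and the path you pass in depends on where your IronKey is mounted.</p>

```ruby
# Read the ELF identification bytes: 4 magic bytes, then the class byte
# (1 = 32-bit, 2 = 64-bit).
def elf_class(path)
  header = File.binread(path, 5)
  magic = "\x7FELF".b
  return :not_elf unless header && header.bytesize == 5 && header.byteslice(0, 4) == magic

  case header.getbyte(4)
  when 1 then :elf32
  when 2 then :elf64
  else :unknown
  end
end

# Usage: ruby elf_class.rb /path/to/executable
puts elf_class(ARGV[0]) if ARGV[0]
```

<p>This only tells you which family of libraries to go hunting for, not which specific packages are missing.</p>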
<p>After some trial and error, I found out you need the 32-bit gcc
libraries:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install lib32gcc1
</code></pre></div></div>
<p>After this, I was able to mount my IronKey.</p>
<p>I don’t <em>think</em> that I needed to add the i386 architecture for this
command to work, but I’m not about to re-install my OS to test that.
So if the above command doesn’t find a package, perhaps you should try
adding the i386 architecture and updating apt.</p>
<p>Hope this helps another poor soul out there,</p>
<p>Grant</p>
Mounting Encrypted lvm Volumes From a Live CD2014-02-01T00:00:00+00:00http://www.grant-olson.net/news/2014/02/01/mounting-encrypted-lvm-volumes-from-a-live-cd<p>I once again fried my system and had to recover some files from an
encrypted filesystem via a Live CD. It’s a frustrating experience. I
tried both Debian Wheezy and Ubuntu 13.10.</p>
<p>First, after you boot it will show an encrypted filesystem icon. If
you click on that the system will prompt for your password. After you
enter your password, it will complain that the partition is invalid
and can’t be mounted.</p>
<p>I’ve seen this before. In the past I would simply run <code class="language-plaintext highlighter-rouge">sudo apt-get
install lvm2</code> and the volumes would appear.</p>
<p>This time around, I was having no such luck. After some surfing, I
found this <a href="http://ubuntuforums.org/showthread.php?t=940904">Ubuntu Forums
thread</a>.</p>
<p>It basically worked. Here’s the exact sequence I followed on my
system:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install cryptsetup
sudo modprobe dm-crypt
sudo apt-get install lvm2
sudo modprobe dm-mod
sudo vgscan # this outputs the volume name
sudo vgchange -a y volume_name_from_above
</code></pre></div></div>
<p>After the last step, my encrypted volumes magically appeared on my
desktop.</p>
<p>Hope this helps some other poor soul,</p>
<p>Grant</p>
Nobody Cares About Signed Gems2013-09-29T00:00:00+00:00http://www.grant-olson.net/news/2013/09/29/nobody-cares-about-signed-gems<blockquote>
<p>This postmortem originally appeared on the rubygems-openpgp-ca.org
site that contained a proof-of-concept system to cryptographically
sign ruby software packages so they could be authenticated on
download. The system was developed after rubygems.org, the primary
distribution network, was hacked and compromised gems were uploaded.</p>
<p>This was by far the most toxic experience I’ve ever had trying
to write Open Source software, but here I just tried to focus on how
uninterested developers are in software authentication, and how quickly
a funnel of tens of thousands of programmers visiting the site resulted
in only a half dozen or so actual test drives of the software.</p>
</blockquote>
<h1 id="nobody-cares-about-signed-gems">Nobody Cares About Signed Gems</h1>
<p>The signing key for the CA expired on August 17, 2013. As an
experiment, I decided to leave the key in an expired state and see if
and when anyone would notice or complain. Today (Sept 29, 2013)
someone finally asked about it on the mailing list. Just over 45
days.</p>
<h2 id="why-would-i-let-the-key-expire">Why would I let the key expire?</h2>
<p>I originally wrote rubygems-openpgp a few years ago because I wasn’t
happy with the existing signing solution and was looking for a side
project to work on. No one paid attention. It was clear that I was
the sole user. So the project sat there in maintenance mode.</p>
<p>Then… A few years later… rubygems.org got hacked! There was no way
to tell if the gems on the site had been compromised. Suddenly there
was interest in signing gems. A few people found my project and
submitted pull requests. Now that there were a few users, I dived
back in and decided to take the gem from the proof-of-concept stage to
a stable piece of software. I spent the next month doing so.</p>
<p>Along the way one-too-many people said that OpenPGP wasn’t useful
because most end users couldn’t get into the strong set in the
Web-of-Trust, ignoring the fact that distribution systems such as
<code class="language-plaintext highlighter-rouge">apt</code> silently use OpenPGP behind the scenes. So I created this site
as a proof-of-concept CA. The implementation was simple. I didn’t
even really need a rails website. The content is exclusively jekyll.
Basically a user would sign up. I would manually verify that they had
control of their email, signing key, and had published gems on
rubygems.org. I would then manually sign off on their key with a
smart-card set aside for that purpose. I got an MVP up and running,
and started circulating the link.</p>
<p>This brought tens of thousands of users to the site. Plenty of
upvotes on reddit. Success, right?</p>
<h2 id="wrong">Wrong!</h2>
<p>One metric to measure success would be the number of people who
requested certification. That number was less than a dozen.</p>
<p>But we need to keep in mind these are just gem authors, right? Those
should be orders of magnitude rarer than the actual users, right?</p>
<h2 id="wrong-1">Wrong!</h2>
<p>I provide a test gem called openpgp-signed-hola. It is the standard
“Hello World” gem with the addition of a digital signature. All the
documentation referred users to use this gem to see how things work.
Rubygems.org has nice charts that show how many times a version of a gem
has been downloaded. Of course this number includes bots and other
automated retrievals in addition to actual human users testing out
rubygems-openpgp. But it does provide an upper bound on the number
of people who tried to verify the test gem.</p>
<p>No more than two dozen people tried to manually verify the test gem.
To be honest, I think this number is probably high. I think the
number was much lower.</p>
<p>Honestly, I found this disappointing. It takes less than 5 minutes to
test gem verification. You would think that number would be at least
equal to the number of upvotes on reddit. That people would actually
read the site and try things out, instead of hitting the upvote button
and going away. You would think the people writing blog posts about
how important signing was would take 5 minutes to try out the
software. But alas, they didn’t.</p>
<h2 id="but-it-takes-time-for-software-adoption-right">But it takes time for software adoption, right?</h2>
<p>I had a few interested users. There were finally signed gems on
rubygems.org signed by people other than me. I expected that would be
enough that I’d get a trickle of signups over the course of the next
year. But after the initial burst of interest, activity came to a
halt. After several months without receiving a single sign-up on the
site, inquiries on the mailing lists, or issues in github, I found
myself wondering why I was paying $20 a month to heroku for https
hosting. I went ahead and canceled that. And that’s when I decided
to let the signing key expire.</p>
<h2 id="why-would-i-let-the-key-expire-again">Why would I let the key expire again?</h2>
<p>The key was set up to expire every 30 days. This was basically a way
to enforce a revocation policy. If the key itself was compromised
(unlikely, it’s on a smart card), or if I was forced to issue
revocations on the CA’s behalf, a periodic expiration would force
users to retrieve updated certificates and hence any revocations.</p>
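<p>To make the mechanics concrete, here is a minimal Ruby sketch of that
policy. The names (<code class="language-plaintext highlighter-rouge">key_expired?</code>, <code class="language-plaintext highlighter-rouge">VALIDITY_DAYS</code>) and the dates are
hypothetical and purely illustrative. This is not code from
rubygems-openpgp; in practice the expiration check is done by GnuPG
itself.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical illustration only: the CA key gets a fresh expiration
# date every VALIDITY_DAYS days.
VALIDITY_DAYS = 30
SECONDS_PER_DAY = 24 * 60 * 60

# Returns true once `now` is past the end of the validity window,
# which is the point where clients must fetch an updated copy of the
# key (and, with it, any revocations issued in the meantime).
def key_expired?(last_renewed, now = Time.now)
  now > last_renewed + VALIDITY_DAYS * SECONDS_PER_DAY
end

renewed = Time.utc(2013, 7, 18) # illustrative renewal date
puts key_expired?(renewed, Time.utc(2013, 8, 1))  # inside the window
puts key_expired?(renewed, Time.utc(2013, 9, 29)) # well past it
</code></pre></div></div>
<p>The short window is the whole point: nobody can keep trusting a stale
copy of the key for more than about a month.</p>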
<p>If there was a small community of people who were using the CA keys, I
would quickly get an email when they started noticing that all their
software was expired. It would at least provide some indication that
I should continue to maintain the CA.</p>
<p>45 days later someone finally noticed.</p>
<h2 id="that-doesnt-prove-nobody-cares-about-signed-gems-it-just-proves-nobody-cares-about-rubygems-openpgp">That doesn’t prove nobody cares about signed gems, it just proves nobody cares about rubygems-openpgp</h2>
<p>True.</p>
<p>But I haven’t seen any activity on the X.509 front either. After the
rubygems compromise, things were supposed to change. That was finally
the kick-in-the-pants the community needed to fix things and take gem authentication seriously.</p>
<ul>
<li>
<p>The rubygems-trust project was started to set up replacement rubygems
with CA capabilities. Activity fizzled out after a month with no
visible results.</p>
</li>
<li>
<p>A few people tried to start signing their gems with X509, but most
gave up because it was impractical.</p>
</li>
<li>
<p>The X509 code in rubygems itself has essentially the same TODO list
as it did when the code was initially merged in 2007.</p>
</li>
</ul>
<p>The above points, like the rest of this essay, are NOT an attempt to call
anyone out; they’re simply what I’ve observed. Getting X509 signing and
verification of gems to actually be used isn’t any farther along than
it was before the rubygems.org hack either.</p>
<h2 id="in-conclusion">In Conclusion</h2>
<p>I’m primarily documenting my experiences with the project so they’re
available if/when there is a push to start signing gems in the future.</p>
<p>This post is negative, but I hope it doesn’t come across as bitter. I
don’t regret any of the time I spent on rubygems-openpgp or the CA.
It was fun! And I’ll continue to maintain rubygems-openpgp if it’s
needed. (The CA, on the other hand, will probably go away when the
domain expires and/or I want to use my free heroku hours for another
project.)</p>
<p>I do wish people were more interested in signing their gems one way or
another, but then again I wish more people (especially techies) would
encrypt their damn emails! Instead they’ll write blog posts and tweet
about the importance of doing so, but won’t actually change their
habits.</p>
<p>-Grant</p>
It Begins...2013-08-27T00:00:00+00:00http://www.grant-olson.net/2013/08/27/it-begins<p>I decided to move my main site away from Google Apps. I don’t think I
plan to blog regularly, but Jekyll is quick and convenient, and the
Lagom theme looks nice. Let the migration begin.</p>
Setting up an OpenPGP smartcard and IronKey on Debian Wheezy2013-06-16T00:00:00+00:00http://www.grant-olson.net/news/2013/06/16/openpgp-smartcard-and-ironkey-on-debian-wheezy<p>My computer just died. I threw the hard drive into another computer.
Everything looked good until it tried to fire up X and then I just got
a blank screen. You know what that means. Time to reinstall the OS.
There were a few gotchas that I thought I’d document here.</p>
<h2 id="live-cd-doesnt-mount-encrypted-partitions">Live CD doesn’t mount encrypted partitions</h2>
<p>I run full disk encryption. After my computer died I wanted to grab a
few files and back up the most important ones before reinstalling the OS. I grabbed
the Debian Live DVD image with xfce.</p>
<p>Everything booted. I clicked on my encrypted partition. I was
prompted for a password. The password was accepted. But then the GUI
complained that it couldn’t mount the filesystem.</p>
<p>After some trial-and-error, I learned that I needed to install lvm2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install lvm2
</code></pre></div></div>
<p>Then I was able to access my encrypted partitions and get the
backup files that I needed.</p>
<h2 id="openpgp-smartcard">OpenPGP smartcard</h2>
<p>After that I reinstalled the OS and all my favorite packages. Gnupg2,
enigmail, thunderbird, keepassx, etc. But after that my smartcard
wouldn’t work. I run into this problem every time I reinstall Debian!</p>
<p>One long-standing
issue is that scdaemon, the driver for the smartcard, isn’t installed
unless you install the gpgsm package:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install gpgsm
</code></pre></div></div>
<p>I’ve done that before. But I still couldn’t use the card unless I was
root. I also needed to install libccid and pcscd:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install libccid pcscd
</code></pre></div></div>
<p>After that I was good.</p>
<p>On this install I’m just running xfce. In the past I’ve had problems
with gnome taking over the smart card. See my previous post on <a href="/news/2013/03/09/using-openpgp-smartcard-on-ubuntu-12-10.html">Using
an OpenPGP Smartcard on Ubuntu
12.10</a>
if you’re still having problems.</p>
<h2 id="getting-ironkey-working">Getting IronKey working</h2>
<p>I also have an IronKey, which is a handy USB drive that has hardware
encryption and (like an OpenPGP smartcard) will self-destruct if
someone tries to brute force it.</p>
<p>Normally I just use the software included on the drive to mount the
partition. But lately I’ve run into problems where the program
cryptically doesn’t run. This is because the software is 32 bit and
I’m running a 64 bit install.</p>
<p>You’ll need to enable multi-architecture support and
install the 32 bit compatibility libraries to get the IronKey working:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get install ia32-libs
</code></pre></div></div>
<p>After that the provided software should work.</p>
How to Break Rails 3.2 But Not 3.0 on Linux But Not OSX2013-03-12T00:00:00+00:00http://www.grant-olson.net/news/2013/03/12/how_to_break_rails_32_but_not_rails_30_on_linux_but_not_osx<p>Over at <a href="http://webkite.com/">WebKite</a>, we finally got around to
updating our rails stack to 3.2 last week. This was of course long
overdue. But it’s one of those things that takes a non-zero amount of
time, and doesn’t provide any immediately visible features, so we’ve
been pushing it back.</p>
<p>Testing went fine. Everything looked good. The code got merged to
mainline. And our CI server decided it couldn’t run tests anymore.
I’ll spare you the full traceback, but the basic error was:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cannot load such file -- multi_json/engines/Yajl
</code></pre></div></div>
<p>Soon after that, a developer pushed a feature branch to our staging
server so team members could review it. Or tried to. That server
blew up with the same error.</p>
<p>We found ourselves facing a problem that would only manifest itself on
Linux boxes, but on none of the developers’ MacBooks. And it was one of
those nasty ones that doesn’t include any of our application code in the
traceback.</p>
<p>And when a search on an error message doesn’t even give a single
stackoverflow page, you know you’re in for a long day.</p>
<h2 id="multi_json-looks-suspicious">multi_json Looks Suspicious</h2>
<p>The traceback did involve the multi_json gem, which dispatches your
json calls to whatever library you want to use. A little research on
that shows that it prefers the oj gem over the yajl-ruby gem.</p>
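<p>Conceptually, that kind of dispatch can be sketched like this. This
is not multi_json’s actual code: <code class="language-plaintext highlighter-rouge">pick_backend</code> and the candidate list are
hypothetical, and the real gem’s preference order and loading logic
may differ.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Try each candidate library in preference order and return the name
# of the first one that can be required on this machine.
def pick_backend(candidates)
  candidates.each do |name|
    begin
      require name
      return name
    rescue LoadError
      next # not installed, or not found under this exact name
    end
  end
  nil
end

# "json" ships with Ruby, so it serves as the fallback here.
puts pick_backend(["oj", "yajl", "json"])
</code></pre></div></div>
<p>Note that everything hinges on the string handed to
<code class="language-plaintext highlighter-rouge">require</code> matching a file on disk.</p>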
<p>I check to see if we’ve explicitly specified yajl-ruby in the Gemfile,
or if it’s just a dependency of another gem. We actually have
explicitly chosen that gem version, which seems a little suspicious.</p>
<p>I go to the commit for that. Sure enough, this was done after we were
forced to upgrade to 3.0.20 for a security issue.</p>
<h2 id="upgrading-to-3020-broke-our-app">Upgrading to 3.0.20 Broke Our App</h2>
<p>The initial upgrade from 3.0.19 to 3.0.20 broke our test suite. I
didn’t think that minor-minor releases were supposed to do that. It
turns out that to deal with some yaml exploits, the fix swapped out
the JSON back end to one that didn’t work with our app.</p>
<p>The replacement back end didn’t properly serialize and then
deserialize a string, as demonstrated by the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.9.3p392 :001 > ActiveSupport::JSON.decode(ActiveSupport::JSON.encode("foo"))
ActiveSupport::OkJson::Error: unexpected "foo"
from /Users/grant/.rvm/gems/ruby-1.9.3-p392@webkite/gems/activesupport-3.0.20/lib/active_support/json/backends/okjson.rb:69:in `textparse'
from /Users/grant/.rvm/gems/ruby-1.9.3-p392@webkite/gems/activesupport-3.0.20/lib/active_support/json/backends/okjson.rb:47:in `decode'
...
</code></pre></div></div>
<p>Apparently there’s some disagreement about whether a bare string is
valid json. It seems clear to me that it is when looking at
<a href="http://json.org/">json.org</a>. And if it’s not, then shouldn’t
<code class="language-plaintext highlighter-rouge">ActiveSupport::JSON.encode</code> throw an error? But I digress…</p>
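<p>For comparison, here is the same round trip using the json library
bundled with Ruby instead of ActiveSupport. This is a sketch that
assumes a reasonably recent version (json 2.x), which accepts a bare
string as a document; older versions rejected it by default.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "json"

encoded = JSON.generate("foo") # a JSON document that is just a string
decoded = JSON.parse(encoded)  # round-trips on json 2.x

puts encoded
puts decoded
</code></pre></div></div>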
<p>The short story is that only
<a href="https://github.com/brianmario/yajl-ruby">yajl-ruby</a> parsed json in a
way that was compatible with our app. So we added the following line
of code to our initialization:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ActiveSupport::JSON.backend = "Yajl"
</code></pre></div></div>
<p>And went on our merry way. Now that seems to be causing problems
on rails 3.2.</p>
<h2 id="but-clearly-other-people-are-using-yaji-ruby">But Clearly Other People Are Using yajl-ruby</h2>
<p>This is a widely used gem. And no one is reporting errors. So what’s
up?</p>
<p>Well first I need to talk to the developer who did the original fix.
He refreshes my memory on why we needed to use a specific json parser
to begin with. Then he notes that the capital Y in
<code class="language-plaintext highlighter-rouge">ActiveSupport::JSON.backend = "Yajl"</code> looks suspicious.</p>
<h2 id="osx-sorta-kinda-has-a-case-insensitive-filesystem">OSX Sorta Kinda Has a Case-Insensitive Filesystem</h2>
<p>If you want to drive yourself crazy, start using caps for directories
and file names:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>johnmudhead:pikimal grant$ pwd
/Users/grant/src/pikimal
johnmudhead:pikimal grant$ cd /USERS/GRANT/SRC/PIKIMAL
johnmudhead:PIKIMAL grant$ cd ..
johnmudhead:SRC grant$ cd ..
johnmudhead:GRANT grant$ cd ..
johnmudhead:USERS grant$ cd ..
johnmudhead:/ grant$
</code></pre></div></div>
<p>If you want to drive rvm crazy, mix it up a bit:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>johnmudhead:pikimal grant$ pwd
/Users/grant/src/pikimal
johnmudhead:pikimal grant$ cd ../Pikimal
==============================================================================
= NOTICE =
==============================================================================
= RVM has encountered a new or modified .rvmrc file in the current directory =
= This is a shell script and therefore may contain any shell commands. =
= =
= Examine the contents of this file carefully to be sure the contents are =
= safe before trusting it! ( Choose v[iew] below to view the contents ) =
==============================================================================
Do you wish to trust this .rvmrc file? (/Users/grant/src/Pikimal/.rvmrc)
y[es], n[o], v[iew], c[ancel]> y
johnmudhead:Pikimal grant$ cd ../PiKiMaL
==============================================================================
= NOTICE =
==============================================================================
= RVM has encountered a new or modified .rvmrc file in the current directory =
= This is a shell script and therefore may contain any shell commands. =
= =
= Examine the contents of this file carefully to be sure the contents are =
= safe before trusting it! ( Choose v[iew] below to view the contents ) =
==============================================================================
Do you wish to trust this .rvmrc file? (/Users/grant/src/PiKiMaL/.rvmrc)
y[es], n[o], v[iew], c[ancel]>
</code></pre></div></div>
<h2 id="and-that-was-the-problem">And that was the problem!</h2>
<p>OSX could require ‘Yajl’ because it doesn’t think it’s any different than ‘yajl’. However, Linux thinks they’re totally different names. A one letter fix magically restored all of our Linux boxes to good health:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>johnmudhead:pikimal grant$ git log -p 0464fd2acc8d4c38f212dc8376dd0e80795b1cc5
commit 0464fd2acc8d4c38f212dc8376dd0e80795b1cc5
Author: Grant Olson &lt;grant@pikimal.com&gt;
Date: Mon Mar 11 13:01:14 2013 -0400
Don't break rails on 3.2 but not 3.0 and linux but not OSX
diff --git a/config/initializers/yajl_as_json_backend.rb b/config/initializers/yajl_as_json_backend.rb
index 3f75b9d..b204999 100644
--- a/config/initializers/yajl_as_json_backend.rb
+++ b/config/initializers/yajl_as_json_backend.rb
@@ -5,4 +5,4 @@
#
# See http://weblog.rubyonrails.org/2013/1/28/Rails-3-0-20-and-2-3-16-have-been-released/
# for details as to why the JSON backend was changed.
-ActiveSupport::JSON.backend = "Yajl"
+ActiveSupport::JSON.backend = "yajl"
</code></pre></div></div>
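<p>The case-sensitivity difference behind that fix is easy to probe from
Ruby. The sketch below uses a hypothetical helper
(<code class="language-plaintext highlighter-rouge">case_sensitive_fs?</code> is not from our codebase) that writes a lowercase
file and then looks for it under a capitalized name. On a default OSX
install I would expect it to print false, and on a typical Linux
filesystem true, though either OS can be configured the other way.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "tmpdir"

# Returns true if looking up a file under a differently-cased name
# fails in `dir`, i.e. the filesystem treats "yajl.rb" and "Yajl.rb"
# as two distinct names.
def case_sensitive_fs?(dir)
  File.write(File.join(dir, "yajl.rb"), "# placeholder backend\n")
  !File.exist?(File.join(dir, "Yajl.rb"))
end

Dir.mktmpdir do |dir|
  puts case_sensitive_fs?(dir)
end
</code></pre></div></div>
<p>Since <code class="language-plaintext highlighter-rouge">require</code> ultimately has to find a file on disk, the same
rule decides whether a miscapitalized backend name loads or blows up.</p>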
<p>Now the only question I’m left with: Why did this work correctly on
rails 3.0? If you have any ideas I’d love to hear them.</p>
Using an OpenPGP Smartcard on Ubuntu 12.102013-03-09T00:00:00+00:00http://www.grant-olson.net/news/2013/03/09/using-openpgp-smartcard-on-ubuntu-12-10<p>I’m currently adding a key continuity feature to rubygems-openpgp. It
works similarly to the way that ssh stores copies of known host keys,
and warns you if the key has changed.</p>
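<p>The idea is simple enough to sketch in a few lines of Ruby. This is
only an illustration of trust-on-first-use, not the rubygems-openpgp
implementation: the <code class="language-plaintext highlighter-rouge">KnownKeys</code> class, the JSON file, the gem name, and
the fingerprints are all made up.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "json"
require "tmpdir"

# Hypothetical trust-on-first-use store: remembers the signing key
# fingerprint first seen for each gem and flags any later change.
class KnownKeys
  def initialize(path)
    @path = path
    @keys = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  # Returns :new the first time a gem is seen, :match if the
  # fingerprint is unchanged, and :changed if it differs.
  def check(gem_name, fingerprint)
    known = @keys[gem_name]
    if known.nil?
      @keys[gem_name] = fingerprint
      File.write(@path, JSON.generate(@keys))
      :new
    elsif known == fingerprint
      :match
    else
      :changed
    end
  end
end

Dir.mktmpdir do |dir|
  store = KnownKeys.new(File.join(dir, "known_keys.json"))
  puts store.check("some-gem", "AAAA1111") # first sighting
  puts store.check("some-gem", "AAAA1111") # same key again
  puts store.check("some-gem", "BBBB2222") # key has changed
end
</code></pre></div></div>
<p>A <code class="language-plaintext highlighter-rouge">:changed</code> result is where the user would get the ssh-style warning.</p>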
<p>This is the first time I’m trying to store any changes locally, and
I was a bit worried about the directories being created properly on
Windows. So I decided to set up a VirtualBox install of Windows 8. My
current hard drive was out of space, so that gave me an excuse to buy
a nice new SSD drive. And that led to installing the latest version
of Ubuntu. And now my Saturday is almost gone.</p>
<p>I had a little trouble getting my OpenPGP smartcard set up, so I
thought I’d write about it here.</p>
<h2 id="problem-1---scdaemon-is-in-the-wrong-package">Problem 1 - scdaemon is in the Wrong Package</h2>
<p>This is actually a problem with the Debian packages that has existed for
many years. If you want to use gpg2, the scdaemon won’t get installed
unless you install the gpgsm package:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install gpgsm
</code></pre></div></div>
<p>That one I was expecting. But I thought I’d document it here anyway.</p>
<h2 id="problem-2---cant-access-the-card">Problem 2 - Can’t Access the Card</h2>
<p>This one I hadn’t seen before. I got the following error with gpg2:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grant@johnicicleboy:~$ gpg2 --card-status
gpg: selecting openpgp failed: Unsupported certificate
gpg: OpenPGP card not available: Unsupported certificate
</code></pre></div></div>
<p>gpg fails as well:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grant@johnicicleboy:~$ gpg --card-status
gpg: selecting openpgp failed: unknown command
gpg: OpenPGP card not available: general error
</code></pre></div></div>
<p>There were a few areas where this same issue was reported, but I
couldn’t find any resolution to the problem.</p>
<p>After some extensive googling, I was able to find out that the
<code class="language-plaintext highlighter-rouge">gnome-keyring-daemon</code> now decides to grab control of your smartcard
reader. Sure enough, I killed the process and <code class="language-plaintext highlighter-rouge">gpg2 --card-status</code>
started working:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grant@johnicicleboy:~$ gpg2 --card-status
Application ID ...: D2760001240102000005000009200000
Version ..........: 2.0
Manufacturer .....: ZeitControl
General key info..: pub 2048R/A18A54D6 2010-03-01 Grant T. Olson (Personal email) &lt;kgo@grant-olson.net&gt;
sec# 2048R/E3B5806F created: 2010-01-11 expires: 2014-01-03
ssb> 2048R/6A8F7CF6 created: 2010-01-11 expires: 2014-01-03
card-no: 0005 00000920
ssb> 2048R/A18A54D6 created: 2010-03-01 expires: 2014-01-03
card-no: 0005 00000920
ssb> 2048R/D53982CE created: 2010-08-31 expires: 2014-01-03
card-no: 0005 00000920
</code></pre></div></div>
<p>Now I began the search for ways to disable the smartcard functionality
on <code class="language-plaintext highlighter-rouge">gnome-keyring-daemon</code>. Couldn’t find anything. There were ways
to switch off its ssh-agent replacement, which I wanted to do anyway
since I ssh authenticate via my smartcard. There were some other
settings about pkcs11 and secrets that seemed promising. So I ran the
following commands to disable these features:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gconftool-2 --type bool --set /apps/gnome-keyring/daemon-components/ssh false
gconftool-2 --type bool --set /apps/gnome-keyring/daemon-components/secrets false
gconftool-2 --type bool --set /apps/gnome-keyring/daemon-components/pkcs11 false
</code></pre></div></div>
<p>But disabling them didn’t do the trick.</p>
<p>Next I went with a hack fix and basically nuked the gnome-keyring-daemon:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo mv /usr/bin/gnome-keyring-daemon /usr/bin/gnome-keyring-daemon.bak
</code></pre></div></div>
<p>This didn’t <em>seem</em> to have broken anything too horribly, and I never
liked the gnome keyring or seahorse to begin with. So I decided to write a blog post for the sake of the interwebz.</p>
<h2 id="but-then-a-complication">But Then, A Complication</h2>
<p>After all that I went to write things up. I decided to re-break
things so I could obtain the error message that <code class="language-plaintext highlighter-rouge">gpg --card-status</code>
threw. So I moved the <code class="language-plaintext highlighter-rouge">gnome-keyring-daemon</code> back into place.</p>
<p>Lo and behold, everything worked! Both <code class="language-plaintext highlighter-rouge">gpg</code> and <code class="language-plaintext highlighter-rouge">gpg2</code> were able to
access the card just fine.</p>
<p>I thought that maybe after I configured gpg-agent to act as the
ssh-agent, it was grabbing my smart-card before gnome-keyring-daemon
could. So I commented out the entries for that, and sure enough card
reading was broken again.</p>
<h2 id="the-proper-fix-or-is-it">The Proper Fix (or is it?)</h2>
<p>Add this to ~/.gnupg/gpg-agent.conf to enable ssh support:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>enable-ssh-support
</code></pre></div></div>
<p>Add this to ~/.bashrc to use gpg-agent for ssh instead of
gnome-keyring-daemon, substituting your host name:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if [ -f "${HOME}/.gnupg/gpg-agent-info-HOSTNAME" ]; then
. "${HOME}/.gnupg/gpg-agent-info-HOSTNAME"
export GPG_AGENT_INFO
export SSH_AUTH_SOCK
fi
</code></pre></div></div>
<h2 id="another-complication">Another Complication!</h2>
<p>Everything seemed to be working, but then I got this generic error
message from Enigmail:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>No SmartCard
could not be found in your reader
Please insert your SmartCard and repeat the operation.
</code></pre></div></div>
<p>After enabling a debug log, it turned out the error was the same
unsupported certificate error I was getting before, even though
signing still worked from the command line. Killing the
gnome-keyring-daemon process allowed me to sign emails again.</p>
<p>So, I went back to:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo mv /usr/bin/gnome-keyring-daemon /usr/bin/gnome-keyring-daemon.bak
</code></pre></div></div>
<p>And everything seems to be working… for now.</p>
<h2 id="thats-all-for-now">That’s All for Now</h2>
<p>If you’ve encountered the same problem, hopefully this will help.</p>
<p>-Grant</p>