Bit Builder
Musings of a Programmer
Justin Hewlett (http://www.blogger.com/profile/08558889981804501934)

Styles in Javascript
Published 2015-01-18

<a href="https://github.com/vjeux" target="_blank">Chris Chedeau</a> gave an excellent <a href="https://speakerdeck.com/vjeux/react-css-in-js" target="_blank">presentation</a> about using JavaScript for styling.<br><br>
Inspired by Chris' talk, I decided to try it out on one of my <a href="http://github.com/jhewlett/react-reversi" target="_blank">side projects</a>, the game of Reversi (Othello) implemented in React.js <a href="http://reversi.divshot.io/" target="_blank">(demo)</a>. React is a good candidate for styling in JavaScript because it <a href="https://github.com/jhewlett/react-reversi/blob/master/js/components/WinnerMessage.js" target="_blank">converts JavaScript objects to inline CSS:</a> <br>
<pre class="brush: js">
render: function() {
  var styles = {
    textAlign: 'center',
    [...]
  };
  return <p style={styles}>{this.props.message}</p>;
}
</pre>
Let's take a look at what we can do. Out of the box, we have <a href="https://github.com/jhewlett/react-reversi/blob/master/js/styles/cell.js" target="_blank">mixins</a> in the form of Javascript functions:
<pre class="brush: js">
var Player = require('../lib/Player');

function getBackgroundImage(player) {
  if (player === Player.One) return 'url("img/red.png")';
  if (player === Player.Two) return 'url("img/blue.png")';
  return 'none';
}

module.exports = function(player) {
  return {
    backgroundImage: getBackgroundImage(player),
    [...]
  };
};
</pre>
We can extend styles with this mixin by <a href="https://github.com/jhewlett/react-reversi/blob/master/js/components/Cell.js" target="_blank">merging Javascript objects</a>:<br>
<pre class="brush: js">
var Player = require('../lib/Player');
var extend = require('object-assign');
var cellStyle = require('../styles/cell');

function buildStyles(owner, playerHint) {
  // Show the owner's piece if the cell is taken, otherwise the hint.
  var cellAppearance = (owner !== Player.None)
    ? owner
    : playerHint;
  return extend({
    border: '1px solid black'
  }, cellStyle(cellAppearance));
}
</pre>
And, of course, we have <a href="https://github.com/jhewlett/react-reversi/blob/master/js/styles/globals.js" target="_blank">variables and constants</a>:
<pre class="brush: js">
module.exports = {
  fontSize: 24
};
</pre>
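To see the three ideas working together, here is a minimal, self-contained sketch. It is not the code from the repository: the `Player` values and the inlined `globals` object are stand-ins I made up so the snippet runs on its own, and it uses the native `Object.assign` in place of the `object-assign` package.

```javascript
// Hypothetical stand-ins for the modules shown above.
const globals = { fontSize: 24 };
const Player = { None: 0, One: 1, Two: 2 };

// Mixin: a plain function that returns a style object.
function cellStyle(player) {
  const images = {};
  images[Player.One] = 'url("img/red.png")';
  images[Player.Two] = 'url("img/blue.png")';
  return { backgroundImage: images[player] || 'none' };
}

function buildStyles(owner, playerHint) {
  const appearance = owner !== Player.None ? owner : playerHint;
  // Later sources win, so the mixin could override the base style.
  return Object.assign(
    { border: '1px solid black', fontSize: globals.fontSize },
    cellStyle(appearance)
  );
}

console.log(buildStyles(Player.None, Player.Two));
```

The result is an ordinary object, so it can be passed straight to a component's `style` prop.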
Hopefully the advantages of this approach over something like Sass are clear. Since it's Just JavaScript, you don't need to learn another language to get the features of a CSS preprocessor.<br><br>
While I'm not completely convinced, this is a compelling approach to styling your applications, especially if you're already using React.

Timeboxing
Published 2014-03-09

Most people in the agile world agree that it's good to timebox meetings. Interestingly, many in this group <i>don't</i> like the idea of having a timeboxed sprint. Instead, they prefer some sort of continuous delivery, such as Kanban.<br />
<br />
So how can we explain this divide? The argument for timeboxing meetings is pretty simple: it helps keep the meeting focused and prevents it from lasting too long. Wouldn't this be good for the development process as well? Why do we see the value in timeboxing meetings, but not the development process?<br />
<br />
I wonder if this group of people just has a sour taste in their mouths from their experience with Scrum. Sprints in Scrum often bring back horrible memories of day-long planning meetings and terrible estimates.<br />
<br />
I don't want to go back to that, either, but couldn't our development work benefit from some sort of soft date or at least an overriding goal to work towards? This does not need to be a firm commitment; it simply helps in setting priorities. If we explicitly state that our goal for the next few weeks is to add feature X, we'll set aside other tasks that don't help with the stated goal.<br />
<br />
Let me be clear: this does not set anything in stone. It is much more lightweight than a Scrum sprint and does not carry the same level of commitment or predictability. It's also important to note that this would not rule out continuous delivery: code could still be released in the middle of the timebox as soon as a logical chunk of work was finished.<br />
<br />
Just as timeboxing meetings helps us make sure we discuss the most important topics, doing the same thing with our development process helps the highest-priority items rise to the top and ensures the team is on the same page.

Implicit Callbacks
Published 2013-12-02

While working in C# this morning, I found myself wanting to force the creator of an object to subscribe to an event. I had something like the following: <br>
<pre class="brush: csharp">
var obj = new MyObject();
obj.RequiredEvent += EventHandler;
</pre>
My first solution was to extract a sort of factory method and use that whenever I wanted to construct the object:
<pre class="brush: csharp">
private MyObject CreateObject()
{
    var obj = new MyObject();
    obj.RequiredEvent += EventHandler;
    return obj;
}
</pre>
This is good, but it's still possible to construct the object without calling the helper method. What I really wanted was for the compiler to enforce that someone subscribe to the event, similar to how using constructor injection ensures that an object is passed all of its dependencies.<br><br>
<h1 style="font-size: 1.2em">Callbacks</h1>
<br>
With this thought in mind, it occurred to me that if I really want to require subscribing to an event, maybe an event is not the right approach. What if the constructor simply took a callback function as a parameter, as follows?
<pre class="brush: csharp">
var obj = new MyObject(EventHandler);
</pre>
In effect, we have taken an <i>implicit</i> callback and made it <i>explicit</i>. Indeed, this is simpler and gives the compile-time check that I was after. <br><br>
<h1 style="font-size: 1.2em">Pick Your Poison</h1>
<br>
As always, it's a trade-off. The callback approach limits me to one handler, whereas an event may have multiple subscribers. Furthermore, the callback approach forces the consumer to provide a handler, whereas with the event approach it is entirely optional.
<br>
<br>
<h1 style="font-size: 1.2em">Impact of Language</h1>
<br>
The language you're using may make one approach more natural than the other. In C#, I usually reach for events. In JavaScript, I'm more likely to use a callback handler. This is mainly due to the fact that C# has language support for events, while JavaScript does not. Passing around functions is also more idiomatic in JavaScript, so callbacks are a good fit.<br><br>
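For comparison, here is roughly what the callback version might look like in JavaScript. This is only a sketch: `MyObject`, `doWork`, and the message are hypothetical names, and since JavaScript has no compile-time check, the best it can do is fail fast at construction time.

```javascript
// Hypothetical JavaScript counterpart of MyObject: the handler is a
// required constructor argument rather than an event subscribed to later.
class MyObject {
  constructor(onRequired) {
    if (typeof onRequired !== 'function') {
      // No compiler to enforce this, so reject bad callers immediately.
      throw new TypeError('MyObject requires a callback function');
    }
    this.onRequired = onRequired;
  }

  doWork() {
    // Notify the single handler supplied by the creator.
    this.onRequired('work finished');
  }
}

const received = [];
const obj = new MyObject(function (message) {
  received.push(message);
});
obj.doWork();
console.log(received); // [ 'work finished' ]
```

Calling `new MyObject()` without a handler throws right away, which is the closest runtime analogue to the compile-time guarantee in C#.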
In fact, were it not for my exposure to more functional languages like JavaScript, I probably wouldn't have thought of using the callback pattern at all. Yet another reason to <a href="http://bit-builder.blogspot.com/2012/08/learn-new-language.html">learn more languages</a>.

SourceDiff
Published 2013-10-19

I wanted a new challenge, so I wrote a side-by-side <a href="https://github.com/jhewlett/SourceDiff">diff tool</a>.<br><br>
My goal was to make a simple diff tool with a clean user interface. A hundred commits later, I've started using it in my own git workflow.
<br><br>
I decided to build it using web technologies, mostly because I haven't seen many good side-by-side diff tools in the browser. In the end, though, I also wanted a desktop version so that I could integrate with git and other source control tools. For this, I used Adobe AIR to build a thin executable shell that accepts command-line arguments.<br><br>
Here's a screenshot of the AIR client: <br><br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOYSndpfgBcCvlX-0ENQpfY0x4cRO7ZwzMhEBKqUasrnHIp5mmJr9rPR4ltcCDNXJaOHGRLr7zhaItmL6GUVO4WYuHRRrDWq3l2RqUA4ePtn_YvDMCc8U5u7Mqh4MDH-W8DEdjcptYSHI/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOYSndpfgBcCvlX-0ENQpfY0x4cRO7ZwzMhEBKqUasrnHIp5mmJr9rPR4ltcCDNXJaOHGRLr7zhaItmL6GUVO4WYuHRRrDWq3l2RqUA4ePtn_YvDMCc8U5u7Mqh4MDH-W8DEdjcptYSHI/s400/Capture.PNG" /></a></div>
<br><br>
It's not the fanciest diff tool out there, but again, I built it mostly <a href="http://bit-builder.blogspot.com/2013/04/not-invented-here-considered-helpful.html">for the experience</a>. I used the well-known <a href="http://en.wikipedia.org/wiki/Longest_common_subsequence_problem">Longest Common Subsequence</a> (LCS) problem as the basis of my diff algorithm. First, I look for and ignore lines that are common to the beginning or end of each file. This is simply an optimization step that takes advantage of the fact that usually a large portion of the code is left unchanged. Then, I apply the LCS algorithm on a line-by-line basis. That is, initially I only find inserted or deleted lines. If I find an insert and a delete on the same line, I report it as a modified line and proceed to find the character differences for these lines. In other words, for each modified line, I apply the LCS algorithm again to find which characters changed. From there, I improve the character diffs by cleaning up <a href="https://neil.fraser.name/writing/diff/">what Neil Fraser calls Semantic Chaff</a>.<br><br>
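The line-level pass described above can be sketched roughly as follows. This is an illustration of the general technique (trim the common prefix and suffix, then run a textbook LCS), not SourceDiff's actual code; `diffLines` and the shape of the edit objects are names I chose for the example.

```javascript
function diffLines(oldLines, newLines) {
  // Step 1: skip lines shared at the start and end of both files.
  let start = 0;
  while (start < oldLines.length && start < newLines.length &&
         oldLines[start] === newLines[start]) {
    start++;
  }
  let oldEnd = oldLines.length;
  let newEnd = newLines.length;
  while (oldEnd > start && newEnd > start &&
         oldLines[oldEnd - 1] === newLines[newEnd - 1]) {
    oldEnd--;
    newEnd--;
  }
  const a = oldLines.slice(start, oldEnd);
  const b = newLines.slice(start, newEnd);

  // Step 2: LCS length table; lengths[i][j] covers a[i..] and b[j..].
  const lengths = [];
  for (let i = 0; i <= a.length; i++) {
    lengths.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lengths[i][j] = a[i] === b[j]
        ? lengths[i + 1][j + 1] + 1
        : Math.max(lengths[i + 1][j], lengths[i][j + 1]);
    }
  }

  // Step 3: walk the table; lines outside the LCS are deletes or inserts.
  // Delete line numbers refer to the old file, insert numbers to the new one.
  const edits = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      i++;
      j++;
    } else if (j === b.length ||
               (i < a.length && lengths[i + 1][j] >= lengths[i][j + 1])) {
      edits.push({ type: 'delete', line: start + i + 1, text: a[i] });
      i++;
    } else {
      edits.push({ type: 'insert', line: start + j + 1, text: b[j] });
      j++;
    }
  }
  return edits;
}

console.log(diffLines(['a', 'b', 'c', 'd'], ['a', 'x', 'c', 'd']));
// [ { type: 'delete', line: 2, text: 'b' },
//   { type: 'insert', line: 2, text: 'x' } ]
```

A delete and an insert landing on the same line number is the signal to treat the line as modified and rerun the same LCS step over its characters.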
Diff tools are hard to get right. If an edit consists mostly of small changes, diff tools as a rule do a great job. However, as soon as you start doing large refactors or moving things around, most tools report a mess of changes and are less than helpful. Some diff tools try to address this by being more language-aware. <a href="http://www.semanticmerge.com/">Some tools</a> go so far as to parse the code and only report semantic changes at a higher level. <br><br>
I like what these new tools are trying to accomplish, but at the end of the day, a text diff is usually sufficient and provides a simple, precise record of everything that changed.

The Happiness Metric
Published 2013-09-26

Measuring the productivity of a software team is a challenging task, much to the dismay of many a manager. Still, managers often grasp for metrics, as if by tweaking variable X or Y their developers will suddenly become more productive. <br><br>
Unfortunately, most metrics in software miss the mark. At best, measuring any sort of development activity causes developers to focus on improving that metric at the expense of other valuable activities. At worst, it causes developers—consciously or not—to game the system. <br><br>
Let's consider the metric of code coverage. Suppose the department sets a policy that any new code that goes into the system must have 80% code coverage. The managers get their beloved data, and the quality of the code will certainly improve—right? <br><br>
Now let's assume the developers make an honest effort to meet that goal. They may even go so far as to test all of their getters and setters. <br><br>
Great, they met the goal, but is the quality of the code improved? Not necessarily. Most would agree that testing getters and setters is a waste of time. Further, just because code is covered by a test does not mean that the test is meaningful. <br><br>
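To make the point about meaningless coverage concrete, here is a small made-up example. `applyDiscount` and both test functions are hypothetical; the two tests exercise exactly the same lines, but only one of them can ever fail.

```javascript
function applyDiscount(price, percent) {
  return price - price * percent / 100;
}

// Runs every line of applyDiscount, so a coverage tool reports 100%,
// but nothing is checked: it would still pass if the formula were wrong.
function coverageOnlyTest() {
  applyDiscount(200, 10);
}

// Same coverage, but this one fails when the result is wrong.
function meaningfulTest() {
  const actual = applyDiscount(200, 10);
  if (actual !== 180) {
    throw new Error('expected 180 but got ' + actual);
  }
}

coverageOnlyTest();
meaningfulTest();
```

A coverage threshold cannot tell these two tests apart, which is why the number alone says little about quality.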
What about measuring the number of features shipped? This may work fine in the short term, but it definitely does not take into account code quality, or the value of said features to the customer. <br><br>
The truth is, good developers thrive when granted autonomy. Setting arbitrary metrics for them to meet undermines whatever autonomy you claim to extend. It takes the "self" out of self-organized teams. It follows then that, ideally, metrics should grow within the team rather than be imposed by management.<br><br>
Personally, the only metric I trust is a <i>gut feeling</i>. I call it the Happiness Metric. <br><br>
To convert my gut feeling into hard data, I take a short survey every few weeks with prompts like the following, rated on a scale from 1 to 5: <br><br>
<ul>
<li>
I delivered value to my customers
</li>
<li>
I focused on quality
</li>
<li>
I worked at a sustainable pace
</li>
<li>
I was intellectually engaged
</li>
<li>
I was happy
</li>
</ul>
<br>
For individuals and teams, these questions will cause introspection and serve as a discussion point. They will help you evaluate whether you worked on the most important thing for your customers. <br><br>
It may not be exactly what management is looking for, but I don't know of any better way to measure the productivity of a team. <br><br>
<i>I'd love to hear about metrics that have worked for</i> your <i>teams. Follow me on Twitter <a href="https://twitter.com/justin_hewlett">@justin_hewlett</a>.</i>

CheckDoSomething()
Published 2013-08-08

I have gotten a lot of use out of what I might call the Check Execute Method pattern. The general idea is that you often want to conditionally execute some action. For example, you may have something like the following appear in several different places: <br>
<pre class="brush: ruby">
if condition
  doSomething()
end
</pre>
If you're not careful, code like this may get sprinkled all over the place. If many callers are checking the same condition, we can DRY up the code as follows:<br>
<pre class="brush: ruby">
def checkDoSomething(condition)
  if condition
    doSomething()
  end
  return condition
end
</pre>
Now we have encapsulated the decision of whether to execute the method. We no longer need to repeat the check everywhere. The 'check' prefix in the method name makes it clear that the execution of the action is dependent on the parameters or state of the object. Additionally, if the caller needs to know whether or not the action was performed, they can use the boolean return value.<br><br>
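Here is a small sketch of the same pattern in JavaScript, where the condition depends on object state rather than a parameter. The buffer and the `checkFlush` name are invented for the example.

```javascript
// Hypothetical example: only flush when something is buffered.
const buffer = ['a', 'b'];
const flushed = [];

function checkFlush() {
  const shouldFlush = buffer.length > 0;
  if (shouldFlush) {
    // splice empties the buffer and returns the removed items.
    flushed.push(...buffer.splice(0, buffer.length));
  }
  // Tell the caller whether the action actually ran.
  return shouldFlush;
}

const first = checkFlush();  // true: two items were flushed
const second = checkFlush(); // false: nothing left to flush
console.log(first, second, flushed);
```

Callers never repeat the emptiness check themselves, and those that care about the outcome can branch on the return value.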
Some may disagree with returning a success value, claiming that this violates command-query separation. Fine, but what is the alternative? We could rely on the caller to check some condition before calling the method, but this is problematic if we have more than a few callers who are all checking the same thing. A caller may even forget to make the check.<br><br>
Alternatively, we could change the method to do the check internally and throw an exception if the condition is not met. But then we're using exceptions for control flow, and each caller would need to wrap the method call in a try-catch block.<br><br>
There are trade-offs to be made, but I think this approach strikes a nice balance.

Syntax Highlighting
Published 2013-07-23

In my <a href="http://bit-builder.blogspot.com/2013/06/tokenize-this.html">previous post</a> about lexers, I mentioned in passing that a lexer could be used to implement a syntax highlighter. Well, I decided to do <a href="https://github.com/jhewlett/SyntaxMarker">just that</a>. <br />
<br />
With the lexer already written, the rest was fairly straightforward — mostly just grabbing the snippet from the DOM and surrounding each token with the correct CSS class. <br />
<br />
Here's some example HTML that demonstrates how to use it to highlight a snippet of Ruby: <br />
<br />
<pre class="brush: xml">
<html>
  <head>
    <title></title>
    <link rel="stylesheet" type="text/css" href="../src/style.css"/>
  </head>
  <body>
    <pre class="syntax-marker-highlight ruby">
def test()
  strings = 'single' + "double"
  sym = :symbol
  num = 4
  return strings + sym.to_s + num.to_s #stringify
end
    </pre>
    <script src="../lib/lexer.js"></script>
    <script src="../src/syntaxMarker.js"></script>
    <script src="../src/markers/rubyMarker.js"></script>
    <script type="text/javascript">
      SyntaxMarker.mark();
    </script>
  </body>
</html>
</pre>
<br>
This would render the following:<br><br>
<pre style="display:inline-block; background-color: #FFFFDD;border: 1px solid black; padding: 10px; "><code style="color: blue; font-weight: bold">def</code> <code class="identifier">test</code>()
<code class="identifier">strings</code> = <code style="color: red">'single'</code> + <code style="color: red">"double"</code>
<code class="identifier">sym</code> = <code style="color: cornflowerblue">:symbol</code>
<code class="identifier">num</code> = <code style="color: orange">4</code>
<code style="color: blue; font-weight:bold">return</code> <code class="identifier">strings</code> + <code class="identifier">sym</code>.<code class="identifier">to_s</code> + <code class="identifier">num</code>.<code class="identifier">to_s</code> <code style="color: green">#stringify</code>
<code style="color: blue; font-weight: bold">end</code>
</pre>
<br>
<a href="https://github.com/jhewlett/SyntaxMarker">SyntaxMarker</a> is still a work in progress, but the core functionality is there. At this point, all that needs to be done is to add support for more languages. This is as simple as writing some regular expressions for language keywords, identifiers, etc.
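The "surround each token with a CSS class" step can be sketched as below. This is not SyntaxMarker's source: the `{ type, text }` token shape, the `markTokens` name, and the convention that plain text carries a `null` type are all assumptions made for the example.

```javascript
// Escape characters that would otherwise be parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical token shape: { type, text }. Whitespace and punctuation
// use type null and are emitted without a wrapper element.
function markTokens(tokens) {
  return tokens.map(function (token) {
    const text = escapeHtml(token.text);
    return token.type
      ? '<code class="' + token.type + '">' + text + '</code>'
      : text;
  }).join('');
}

const html = markTokens([
  { type: 'identifier', text: 'num' },
  { type: null, text: ' = ' },
  { type: 'number', text: '4' }
]);
console.log(html);
// <code class="identifier">num</code> = <code class="number">4</code>
```

In the browser, the resulting string would be assigned back to the `innerHTML` of the element the snippet came from, and the stylesheet decides what each class looks like.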
<br><br>Again, it was neat to see that most of the actual work is done by the lexer. From there, we can use the resulting tokens for many different purposes.

Tokenize This
Published 2013-06-05

As a man of my word, I have <a href="http://bit.ly/NotInvHere">reinvented the wheel</a> once again. I wrote a simple <a href="http://bit.ly/TokenJS">lexer</a> in JavaScript.<br />
<br />
A lexer (or scanner) is a tool in programming language design that is responsible for breaking an input string into individual tokens. These tokens can then be handed to a parser for further processing. For example, consider the following statement in a hypothetical language:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">double := 2 * x //double the number</span><br />
<br />
Here, our lexer might give back the following tokens:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">ID (double), ASSIGN, NUMBER, TIMES, ID (x)</span><br />
<br />
Note that the lexer discarded the whitespace and the comment. The rest of the input was matched to a predefined token.<br />
<br />
Most lexers use regular expressions to specify the matching rules. Each of these rules, when matched, returns some specified token. Here is how it would look in <a href="http://bit.ly/TokenJS">Token.JS</a>:<br />
<br />
<pre class="brush: javascript">var lexer = new TokenJS.Lexer(
  'double := 2 * x //double the number', {
    root: [
      [/\/\/.*/, TokenJS.Ignore],  //ignore comments
      [/\s+/, TokenJS.Ignore],     //ignore whitespace
      [/[a-zA-Z]+/, 'ID'],
      [/[0-9]+/, 'NUMBER'],
      [/:=/, 'ASSIGN'],
      [/\*/, 'TIMES']
    ]
  });
var tokens = lexer.tokenize();
</pre>
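To give a feel for what a rule-based lexer does under the hood, here is a stripped-down sketch of the matching loop. It is not how Token.JS is implemented; `tokenize`, `IGNORE`, and the token objects are names chosen for illustration. At each position it tries the rules in order and accepts the first one that matches right at that position.

```javascript
const IGNORE = Symbol('ignore');

function tokenize(input, rules) {
  const tokens = [];
  let position = 0;
  while (position < input.length) {
    const rest = input.slice(position);
    let matched = false;
    for (const [pattern, type] of rules) {
      const match = pattern.exec(rest);
      // Only accept non-empty matches anchored at the current position.
      if (match && match.index === 0 && match[0].length > 0) {
        if (type !== IGNORE) {
          tokens.push({ type: type, text: match[0] });
        }
        position += match[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError('Unexpected character at position ' + position);
    }
  }
  return tokens;
}

const tokens = tokenize('double := 2 * x //double the number', [
  [/\/\/.*/, IGNORE],   // comments
  [/\s+/, IGNORE],      // whitespace
  [/[a-zA-Z]+/, 'ID'],
  [/[0-9]+/, 'NUMBER'],
  [/:=/, 'ASSIGN'],
  [/\*/, 'TIMES']
]);
console.log(tokens.map(function (t) { return t.type; }).join(', '));
// ID, ASSIGN, NUMBER, TIMES, ID
```

Rule order matters in a loop like this: because the first matching rule wins, more specific patterns such as reserved words would need to come before the general identifier rule.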
<br />
Even if you're not writing a compiler, lexing has many other uses. For starters, imagine you wanted to write a syntax highlighter for a language. This would be trivial with a lexer. You would simply specify a regular expression for identifiers, literals, reserved words, etc. in the given language. The 'token' in each case could be a color to highlight the match with.<br />
<br />
In general, string processing is a big part of many applications, so it's good to know what tools you have at your disposal.<br />
<br />

"Not Invented Here" Considered Helpful
Published 2013-04-19

<span style="font-size: 15px; font-family: Arial;">We are constantly told not to "re-invent the wheel," as if doing so would be an utter waste of time. Yet doing so is extremely worthwhile, if only for the experience. Maybe a better way to put it would be "dissect the wheel, learn about all of its intricacies, then rebuild it from scratch." Maybe someone has already done it. Maybe they've done it very well. But none of that matters. Who says your design won't be better? Who says it won't better apply to your needs? And what of the grand sense of accomplishment that you've done something by yourself? <br><br>
In programming, <a href="http://en.wikipedia.org/wiki/Not_invented_here#In_computing">Not Invented Here</a> is spoken of as strictly a bad thing. "There's a library for that," they say. They're right—there probably is a library, but there's something critical missing in these discussions. Using a library does not necessarily give you understanding as to how it works. <br><br>
In my undergraduate compilers course, we used a few libraries to help us parse our language. But before using a new library, the professor would ensure that we had a clear understanding of how it worked. Before using regular expressions, for example, he made sure we had a good background in finite automata and state machines. "Given enough time," he would ask, "could you have implemented this yourself?" <br><br>
More and more, I'm convinced that we should have an intimate understanding of the libraries that we use. Where necessary, we should sculpt them, tear them apart—even rewrite them. <br><br>
Now, when you're on the clock, your opportunities for exploration may be limited. But at a bare minimum, take the time to do so in your personal projects. After all, why else are you doing personal projects, if not to learn and have fun?</span>

Cut the Ceremony
Published 2013-02-25

<span style="font-size: 15px; white-space: pre-wrap;"><span style="font-family: Arial;">I'm happy to see a growing interest in languages that attempt to cut out the ceremony: things in a language, often keywords, that take your attention away from the code you actually care about. Often these are things that you type only to appease the compiler, or to make the language easier to parse (I'm looking at you, semicolon). To see what I mean, compare C#'s</span></span>
<span style="font-size: 15px; white-space: pre-wrap;">
</span><br />
<pre class="brush: csharp">public virtual void Method(IFactory arg)
{
    arg.DoStuff();
}</pre>
<span style="font-size: 15px; white-space: pre-wrap;"><span style="font-family: Arial;">to Ruby's</span></span>
<span style="font-size: 15px; white-space: pre-wrap;">
</span><br />
<pre class="brush: ruby">def method arg
  arg.doStuff
end</pre>
<span style="font-size: 15px; white-space: pre-wrap;">
</span>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">In this case, Ruby does away with ceremony mainly through dynamic typing. This frees the programmer from having to specify types in method arguments or returns. The downside to this approach, of course, is that you won’t find out about type errors until run time. While this problem can be mitigated through discipline and good testing, larger projects are arguably better off with static type checking—that is, unless you really need the flexibility that dynamic languages provide.</span></span><br />
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;"><br /></span></span>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">In fact, there <i>are </i>languages that manage to reduce ceremony without sacrificing static checking. Scala, for example, is a compiled, statically typed language that almost looks like Ruby if you squint hard enough. Scala is able to reduce the noise by leveraging such things as type inference and implicit returns. Consider the following example seen <a href="http://pragprog.com/magazines/2011-10/scala-for-the-intrigued">here</a>:
</span></span><br />
<pre class="brush: scala">def doubleValue(number : Int) : Int = {
  return number * 2
}</pre>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">
By making the return implicit and letting the compiler infer the return type, it can be reduced to the following:
</span></span><br />
<pre class="brush: scala">def doubleValue(number : Int) = {
  number * 2
}</pre>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">
</span></span>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">Finally, we can eliminate the braces since the body is a single expression:</span></span>
<br />
<pre class="brush: scala">def doubleValue(number : Int) = number * 2
</pre>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">Thus, languages like Scala are able to achieve the terseness of Ruby while maintaining the static type safety and performance of C#.</span></span><br />
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;"><br /></span></span>
<span style="font-family: Arial;"><span style="font-size: 15px; white-space: pre-wrap;">QED.</span></span>

Tests or Clean Code: You Pick
Published 2013-01-16

If you had to choose between clean code without tests and untidy code that is covered by tests, which would you pick?<br />
<br />
The more that I do TDD, the more I have come to realize a minor flaw: TDD optimizes for test coverage, at the potential expense of clean code. It's entirely possible with TDD to end up with code that passes all the tests, yet is not particularly readable. Now, if the "Refactor" step is performed properly, this can mitigate the problem. But even with refactoring, TDD's baby-steps approach to solving a problem does not always produce the most readable code. Too often, we are focused only on passing the current test and we lose sight of the overall design (assuming, of course, that we have a design).<br />
<br />
The classic alternative is to do some initial design work, code, then cover the code with tests. But one of the main arguments for TDD is that writing tests after the fact often gets neglected by lazy programmers. Then who is to say that the same thing won't happen with TDD? That is, the "Refactor" step can fall by the wayside just as easily.<br />
<br />
The question, then, is what are you optimizing for?<br />
<br />
If the "Refactor" step is neglected, TDD can lead to working, yet untidy, code. Upfront design can lead to clean code, yet you haven't proved that it works if you fail to cover it with tests afterwards.<br />
<br />
If it came down to it, I would probably pick covered, messy code. The next developer to work in the code would be less likely to break my functionality if there are tests in place, even if it initially may be less readable.<br />
<br />
Ideally, of course, there would be no need to make such a compromise in the first place. What are your thoughts? How can we avoid this?

Basement Hackers and Developer Elitism
Published 2012-12-23

I have read a lot of blogs complaining about the state of affairs among developers. Jeff Atwood <a href="http://www.codinghorror.com/blog/2007/02/why-cant-programmers-program.html">asks</a>, "Why Can't Programmers.. Program?" Jason Gorman <a href="http://codemanship.co.uk/parlezuml/blog/?postid=1155">goes so far</a> as to say:<br />
<blockquote class="tr_bq">
<span style="background-color: white; color: #333333; font-family: verdana, helvetica; font-size: x-small; line-height: 18px;">Consider that not all developers are equal, and some developers achieve more than others. In reality, 80% of the working code in operation today can probably be attributed to small proportion of us. The rest just get in the way. If anything, if we thinned down the herd to just the stronger programmers, more might get done.</span></blockquote>
Certainly, skill level varies among developers. But to make ridiculous claims like this is nothing more than a case of developer elitism and almost sounds like some sort of ethnic cleansing campaign.<br />
<br />
What's really going on is that, due to the advent of affordable PCs and the internet, it's possible for someone who has never set foot in a university to learn to code from online tutorials. The barrier to entry is lower than ever before. This is not necessarily a bad thing. But I can see why it would cause the older folks to become bitter.<br />
<br />
The same thing happened with photography. It used to be that in order to become a photographer, you needed access to a darkroom. You needed to know about the different chemicals. You needed to work for some time as an apprentice.<br />
<br />
When digital photography and photo editing came along, the barrier to entry became significantly lower. And, of course, some seasoned professionals are now bitter that legions are becoming photographers without ever needing to enter a darkroom.<br />
<br />
A traditional education in computer science is extremely helpful. It's important to have a foundation in algorithms and data structures. But there are plenty of people who are successful developers that have obtained their education through different means.<br />
<br />
Instead of pointing fingers at newcomers and attempting to "thin down the herd" as Jason puts it, we should welcome to the fold those who truly have a desire to become great programmers.<br />
<br />
Now, were I in a position to hire developers, I would definitely test their ability to code. This is one case where it's appropriate to evaluate skills and make a judgement call. But to make blanket statements like those above is just plain ignorant.

Code Ownership
Published 2012-12-03

In economics, the Tragedy of the Commons tells of a proverbial pasture that is common to several herders. The observation goes <a href="http://en.wikipedia.org/wiki/Tragedy_of_the_commons#Theories_and_examples">as follows</a>: "...it is in each herder's interest to put the next (and succeeding) cows he acquires onto the land, even if the quality of the common is damaged for all as a result, through overgrazing. <b>The herder receives all of the benefits from an additional cow, while the damage to the common is shared by the entire group.</b> If all herders make this individually rational economic decision, the common will be depleted or even destroyed, to the detriment of all."<br />
<br />
I often wonder how this principle applies to a software team. Imagine that you are adding a feature that requires you to add some methods to an existing class that is already pretty big. The simplest thing to do is to just add the methods and move on. It's a tempting choice because it provides immediate results to you, and the costs won't be borne until a later date. At that time they'll be shared across the whole team and not borne by you alone. It takes a disciplined programmer to take the additional steps to pull out the methods into a new class in hopes of reducing future team maintenance.<br />
<br />
How do we solve this dilemma? In the case of the pasture, the economist would say to give a private pasture to each herder that he alone is responsible for. Can this same solution be applied to our software teams?<br />
<br />
I think a certain sense of code ownership is beneficial. This allows a developer to specialize in a certain part of the system. She will be more likely to keep the code clean if she knows she'll be working in it again in the future. She'll be familiar with the code whenever a feature needs to be added there. The problem with this, though, is that developers typically have big egos. If you are given "ownership" over a section of code, you will be less open to other developers making improvements to your algorithms or design. It discourages team collaboration and dissemination of knowledge.<br />
<br />
At the other extreme, you may get exposure to more parts of the system, but you would likely not have any personal investment in any specific section of the code.<br />
<br />
I think the solution is some combination of the two. When working on a particular feature or bug fix, focus your efforts on getting to know that area of code. Focus your cleanup efforts in that area, knowing that you (or a team member) are likely to encounter the code again in the future. When you are finished, you can move on to another section of code and repeat the process. Over time, each member of the team will ideally have some sense of ownership in each module of the system.<br />
<br />
Here are some practices that I think provide the appropriate balance of ownership:<br /><br />
<h3>
Small Teams</h3> <br />
Ideally, teams should consist of 3-4 developers working on a particular, well-defined feature. This means that each developer has a significant investment in the feature, sharing at least a quarter of the responsibility for the quality of the code. Because the team is often working in the same section of code for the duration of the feature, there is a greater incentive to keep the code clean.<br /><br />
<h3>
Code Reviews</h3><br />
Code reviews, whether done continually through pairing or just before check-in, serve as an extra opportunity to consider the impact of your changes on the code base, including how it will affect future maintenance by the team as a whole.<br /><br />
<h3>
Ownership of Bugs</h3><br />
Where reasonable, any bugs that come up during development ought to be addressed by the individual who caused them. Attitude is important here. This is not an exercise in assigning blame or public humiliation. Rather, that developer is often more familiar with the section of code. This also helps developers to have ownership of the quality of the code that they produce.<br />
<br />
The key here is that <b>successful systems are written by developers who take personal responsibility for their code, yet at the same time realize they are part of a team.</b> With the understanding that everyone likes to work in a clean code base, the team, and each individual, must commit to practices that produce quality, maintainable code.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-12175727905414399252012-10-29T22:46:00.001-07:002012-10-30T11:02:59.437-07:00Don't Parse with RegExAt some point, I think all developers have tried to use regular expressions to parse input. It may work at first, but it quickly becomes unwieldy for all but the most trivial of inputs.<br />
<br />
Engineers learned early on that it simplifies things drastically to separate the compiler into separate modules – first find the individual tokens, then use a grammar to see if the tokens are arranged correctly, then assign semantic meaning to the statements, and so forth.<br />
<br />
Regular expressions are great for getting the individual tokens. This is the scanning, or lexing phase. With the tokens in hand, we're ready to do the actual parsing. To do this properly, we need to specify a formal grammar. This is typically done in BNF. Grammars can deal with nesting and other constructs that regular expressions don't handle well. For example, imagine trying to match up opening and closing sets of parentheses using regular expressions. This is trivial to specify in a grammar.<br />
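<br />
Here is a minimal sketch of that split in Python. The token set (parentheses and identifiers) and the names <span style="font-family: Courier New, Courier, monospace;">tokenize</span> and <span style="font-family: Courier New, Courier, monospace;">parse</span> are assumptions made up for illustration. The grammar is written in an EBNF-like shorthand in the comments.<br />

```python
import re

# Lexing: a regular expression is enough to find the individual tokens.
TOKEN_RE = re.compile(r"\s*([()]|[A-Za-z_]\w*)")


def tokenize(text):
    """Split text into '(', ')' and identifier tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character near position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


# Parsing: a tiny grammar handles the nesting.
#   expr ::= IDENT | "(" expr* ")"
def parse(tokens):
    """Parse exactly one expression and return it as nested lists."""
    tree, next_index = _parse_expr(tokens, 0)
    if next_index != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[next_index])
    return tree


def _parse_expr(tokens, i):
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[i]
    if token == "(":
        items = []
        i += 1
        # Keep parsing nested expressions until the matching ")".
        while i < len(tokens) and tokens[i] != ")":
            item, i = _parse_expr(tokens, i)
            items.append(item)
        if i >= len(tokens):
            raise SyntaxError("missing closing parenthesis")
        return items, i + 1
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return token, i + 1
```

The lexer never needs to know about nesting, and the parser never looks at raw characters, so each piece can be tested on its own.<br />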
<br />
Separation of concerns is another advantage of doing it this way. If we truly follow the single responsibility principle, we have no justification for trying to lex and parse at the same time.<br />
<br />
Regular expressions are powerful tools by themselves, but don't abuse them – especially if you're dealing with a full-fledged domain-specific language.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-89066113532289041532012-09-16T20:53:00.000-07:002012-09-16T20:53:22.893-07:00State vs. Interaction TestingIn my post <a href="http://bit-builder.blogspot.com/2012/06/unit-testing-and-assumptions.html">Unit Testing and Assumptions</a>, I described my ideal type of test: <span style="font-family: inherit;">"<span style="background-color: white; line-height: 18px;">I don't care how [the algorithm does] it, just that the output is what I expect. </span><span style="background-color: white; line-height: 18px;">These tests are short, easy to write, and they make no assumptions about the underlying code." What I didn't realize at the time is that I was simply making the case for state-based testing.</span></span><br />
<span style="font-family: inherit;"><span style="background-color: white; line-height: 18px;"><br /></span></span>
State-based testing focuses on the <i>results</i> of a computation, not on the specific steps of an algorithm. Instead of verifying that <span style="font-family: Courier New, Courier, monospace;">add(2, 2)</span> was called on a mock, for example, we simply assert that the result is <span style="font-family: Courier New, Courier, monospace;">4</span>. This makes the tests less brittle and usually easier to write because they need less setup. It's also better from a TDD perspective since you <a href="http://martinfowler.com/articles/mocksArentStubs.html#SoShouldIBeAClassicistOrAMockist">don't need to know the details of the code-to-be-implemented in order to write your failing test.</a> For better or worse, these types of tests can quickly turn into <a href="http://martinfowler.com/articles/mocksArentStubs.html#TestIsolation">"mini-integration"</a> tests since they lack the test isolation that mocks provide.<br />
<br />
<span style="background-color: white; line-height: 18px;">State-based testing stands in contrast to interaction or behavioral testing. Here the focus is, well, on the <i>interaction</i> of the system under test with a mock object. This can be useful when the method delegates some work to a collaborator. The collaborator object is already tested, so we just need to verify that the delegation takes place with the correct parameters. Indeed, this is essentially the only way to unit test code that makes calls to a web service or database. The downside, of course, is that the test locks down the behavior of the code. If that behavior ever changes, even if the results stay the same, we must update the test accordingly.</span><br />
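<br />
To make the contrast concrete, here is a small sketch of both styles side by side. It assumes Python 3's <span style="font-family: Courier New, Courier, monospace;">unittest.mock</span>. The <span style="font-family: Courier New, Courier, monospace;">OrderService</span> class and its repository collaborator are hypothetical, invented only for this example.<br />

```python
import unittest
from unittest import mock


def add(a, b):
    return a + b


class OrderService(object):
    """Hypothetical class that delegates persistence to a collaborator."""

    def __init__(self, repository):
        self._repository = repository

    def place_order(self, item, quantity):
        self._repository.save({"item": item, "quantity": quantity})


class StateBasedTest(unittest.TestCase):
    def test_add_returns_sum(self):
        # Only the result matters, not how add() computes it.
        self.assertEqual(4, add(2, 2))


class InteractionTest(unittest.TestCase):
    def test_place_order_saves_to_repository(self):
        repository = mock.Mock()
        OrderService(repository).place_order("book", 2)
        # Verify that the delegation happened with the expected argument.
        repository.save.assert_called_once_with(
            {"item": "book", "quantity": 2})
```

The second test would have to change if <span style="font-family: Courier New, Courier, monospace;">place_order</span> started calling a different method on the repository, even if the stored data ended up the same.<br />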
<br />
I tend to be a proponent of state-based testing whenever possible. I use it in cases where my class under test has no collaborators, or when the collaborators are lightweight. I like that this type of testing <a href="http://martinfowler.com/articles/mocksArentStubs.html#DesignStyle">encourages return values over side-effects</a>. I'm also usually more concerned that the results themselves are correct than how the code computed them.<br />
<br />
If I have a good reason to isolate the class completely, then I use mocks to verify behavior. This is usually on the seams of the system, such as near the data access layer.<br />
<br />
Ultimately, I end up doing whatever feels right at the time given the situation.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-82825316991446069722012-08-21T10:13:00.001-07:002012-09-16T20:56:09.002-07:00Learn a New LanguageStudents of linguistics have probably heard of the <i>Sapir-Whorf Hypothesis.</i> It essentially states that <b>the languages you know directly influence your understanding of the world</b>. In other words, the languages that you know influence how you think about and approach problems.<br />
<br />
Though often applied to natural languages, this can certainly be applied to programming languages as well. If we have had exposure to multiple programming language paradigms, such as declarative, functional, and imperative, we will be better at solving problems by choosing the right tool for the job.<br />
<br />
Certain language paradigms work better for certain classes of problems. For example, query languages such as SQL work well as declarative languages because we're more interested in <i>what</i> to find rather than the exact steps of finding it. Declarative languages are also a great way to specify the view for a program. We see this with XML for Android and XAML for Windows WPF applications.<br />
<br />
Then we have functional languages, which extol the virtues of pure functions that are free from state and side effects. This makes it easier to reason about and test our programs. It also allows us to take advantage of lazy evaluation and to parallelize our programs quite trivially.<br />
<br />
Finally, imperative languages allow us to specify <i>how</i> to accomplish a given task. This is important when performance is a concern or when we need more granularity.<br />
<br />
Fortunately for us, many of the newer languages are multi-paradigm, allowing us to use declarative, functional, and imperative ideas in a single language.<br />
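<br />
Python is one such multi-paradigm language. As a rough illustration, here is the same small computation (the sum of the squares of the even numbers) written three ways. The function names are made up for this sketch, and a generator expression is only loosely "declarative".<br />

```python
from functools import reduce
from itertools import count, islice


def sum_even_squares_imperative(numbers):
    # Spell out *how*: loop, test, accumulate.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


def sum_even_squares_functional(numbers):
    # Compose pure functions; no variable is mutated along the way.
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda acc, n: acc + n, squares, 0)


def sum_even_squares_declarative(numbers):
    # Describe *what* is wanted and let sum() do the iteration.
    return sum(n * n for n in numbers if n % 2 == 0)


# Lazy evaluation: the generator describes an infinite sequence, but only
# as many values as islice() asks for are ever computed.
first_three = list(islice((n * n for n in count(1) if n % 2 == 0), 3))
```

All three functions return the same result for the same input, so the choice between them comes down to readability and the problem at hand.<br />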
<br />
So <b>learn a new language</b>, preferably of a paradigm that you're not as familiar with. Even if you rarely utilize the language itself, the concepts gleaned from that language will make you a better programmer and problem-solver.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com2tag:blogger.com,1999:blog-2384549447454153727.post-77257297397391273542012-08-16T11:27:00.001-07:002012-08-16T11:29:58.586-07:00Back to the TerminalOne of the big problems of GUI programming is trying to separate business logic from the presentation. Many frameworks and patterns, such as <a href="http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Model-view-controller</a>, exist for just this purpose.<br />
<br />
I recently came across a less common approach to this problem. From the <a href="https://stackoverflow.fogbugz.com/default.asp?W29030">StackOverflow Podcast #41</a>:<br />
<blockquote class="tr_bq">
<span style="background-color: white;"><span style="font-family: Georgia, Times New Roman, serif;"><u>Atwood</u>: ...the classic UNIX way of developing a GUI was you <b>start with a command-line app</b>, which has a defined set of inputs and outputs, pipes, basically text going in and out of it.<br /><u>Spolsky</u>: A lot of command-line arguments.<br /><u>Atwood</u>:<b> And then you put a GUI on top of that, so then you have perfect separation.</b> You don't necessarily have a great app which is the deeper problem, but you have perfect separation, because you can test the command line independently...</span></span></blockquote>
Seems like an interesting solution, especially if your target audience is expert users who may prefer a command-line application anyway.<br />
<br />
Unknowingly, this is the approach I took with my <a href="http://bit-builder.blogspot.com/2012/06/my-first-open-source-project.html">Batch Photo Editor</a>. From the beginning, it was designed to be a command-line application. This allowed me to focus on the core logic of the application and not have to worry about presentation. Later, I will always have the option of adding a GUI, in which case I would get the business-presentation separation for free.<br />
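<br />
A minimal sketch of what that separation can look like in Python follows. It is not the actual code of the Batch Photo Editor. The function <span style="font-family: Courier New, Courier, monospace;">scale_to_width</span> and the option names are hypothetical, chosen only to show the core logic living apart from the command-line layer.<br />

```python
import argparse


def scale_to_width(width, height, new_width):
    """Core logic: return (new_width, new_height), keeping the aspect ratio."""
    new_height = int(round(height * new_width / float(width)))
    return new_width, new_height


def build_parser():
    # Presentation layer: only knows how to read options from the user.
    parser = argparse.ArgumentParser(
        description="Hypothetical resize calculator")
    parser.add_argument("--width", type=int, required=True)
    parser.add_argument("--height", type=int, required=True)
    parser.add_argument("--resize", type=int, required=True)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    print("%dx%d" % scale_to_width(args.width, args.height, args.resize))
```

Calling <span style="font-family: Courier New, Courier, monospace;">main(["--width", "1600", "--height", "1200", "--resize", "720"])</span> goes through the command-line layer, while a GUI added later could call <span style="font-family: Courier New, Courier, monospace;">scale_to_width</span> directly without touching the argument parsing.<br />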
<br />
The downside of this approach is that, if you're not careful, the resulting GUI may be nothing more than a wrapper around your original command-line interface. If your main goal is an extremely user-friendly GUI, perhaps you're better off with <a href="http://www.codinghorror.com/blog/2008/04/ui-first-software-development.html">UI-First Development</a>. When possible, though, it's refreshing to be able to focus on the actual <i>code</i> that you're writing and not the syntax to wire up a button click handler.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-83045788414031022262012-08-12T15:35:00.001-07:002012-08-12T20:06:04.696-07:00Refactor at Your Own RiskOften when we're left to work in someone else's code, we think it's poorly written and seek to improve it or, perhaps worse, rewrite it entirely. <b>Refactor at your own risk, however. Doing so often has unintended consequences or provides little value. </b>Here are my general guidelines:<br />
<br />
<h4>
<ul>
<li>Don't refactor code that won't change</li>
</ul>
</h4>
<div>
<br /></div>
<div>
When reworking code, you get the biggest bang for your buck with the code that your programmers work in nearly <i>every day</i>. You know, those places in code that are constantly changing when bugs are discovered or requirements change.</div>
<div>
<br /></div>
<div>
Now contrast this with refactoring that class that hasn't changed in years. Generally we refactor a piece of code so that it will be easier to make changes in the future. If a class is fairly static<span style="font-size: x-small;">*</span>, then refactoring provides little to no benefit and only increases the risk of adding new bugs.<br />
<br />
<h4>
<ul>
<li>Refactor in increments</li>
</ul>
</h4>
</div>
<div>
<br /></div>
<div>
Refactoring works best when done in small increments. Joel Spolsky reminds us to <a href="http://www.joelonsoftware.com/articles/fog0000000069.html">avoid the big rewrite</a>:<br />
<blockquote class="tr_bq">
<div style="background-color: #f5f4df; line-height: 20px; margin-bottom: 1em;">
<span style="font-family: Georgia, Times New Roman, serif;">We're programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We're not excited by incremental renovation: tinkering, improving, planting flower beds.</span></div>
<div style="background-color: #f5f4df; line-height: 20px; margin-bottom: 1em;">
<span style="font-family: Georgia, Times New Roman, serif;">There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: <i>they are probably wrong.</i> The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:</span></div>
<div style="background-color: #f5f4df; font-size: 19px; font-weight: bold; line-height: 24px; margin-bottom: 1em; text-align: center;">
<span style="font-family: Georgia, Times New Roman, serif;">It’s harder to read code than to write it.</span></div>
</blockquote>
</div>
<h4>
<ul>
<li>Be wary of refactoring uncovered code</li>
</ul>
</h4>
<div>
<br /></div>
<div>
Refactoring is best done when there is a suite of unit tests that assert the correct behavior of the class. In the absence of tests, consider first writing some tests or perhaps avoid refactoring altogether.</div>
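<br />
One way to get that safety net is to pin down the current behavior with a few tests before touching anything. The sketch below uses a made-up <span style="font-family: Courier New, Courier, monospace;">format_price_legacy</span> function as the "old" code and checks that a refactored version produces the same output for the same inputs.<br />

```python
import unittest


def format_price_legacy(cents):
    # Hypothetical legacy code we would like to clean up.
    dollars = cents // 100
    rest = cents % 100
    if rest < 10:
        return "$" + str(dollars) + ".0" + str(rest)
    return "$" + str(dollars) + "." + str(rest)


def format_price(cents):
    # Refactored version; it must behave exactly like the legacy one.
    return "$%d.%02d" % divmod(cents, 100)


class FormatPriceTest(unittest.TestCase):
    # Expected values were recorded from the legacy code *before* refactoring.
    CASES = [(0, "$0.00"), (5, "$0.05"), (199, "$1.99"), (120000, "$1200.00")]

    def test_legacy_behavior_is_pinned(self):
        for cents, expected in self.CASES:
            self.assertEqual(expected, format_price_legacy(cents))

    def test_refactored_version_matches(self):
        for cents, expected in self.CASES:
            self.assertEqual(expected, format_price(cents))
```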
<br />
<ul>
<li><b>Use refactoring tools</b></li>
</ul>
<br />
Where possible, use tools that will perform a variety of common refactors, such as renaming variables, extracting methods, and changing method signatures.<br />
<br />
Don't rework code for no reason. Make those changes that will increase the readability and maintainability of your code. As always, the benefits must be weighed against the risks. Remember that, to your users, <a href="http://www.codinghorror.com/blog/2007/12/nobody-cares-what-your-code-looks-like.html">a working product is king</a>.<br />
<br />
<span style="font-size: x-small;">*no pun intended</span>Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-89070934278670206822012-08-06T21:49:00.000-07:002012-08-23T16:32:24.255-07:00Object-Relational MappingI've been thinking a lot lately about the object relational mismatch. It's interesting to note that relational databases and object-oriented programming basically evolved around the same time, yet they're not very compatible with each other.<br />
<br />
There have been a lot of good posts on the subject, such as Jeff Atwood's <a href="http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html">Object-Relational Mapping is the Vietnam of Computer Science</a>. He basically concludes that there are four solutions to the problem: give up relational databases, give up objects, manually map between them, or use an object-relational mapper.<br />
<br />
Recently I had concluded that object-relational mapping is the way to go. Now I'm not so sure. As I was looking at some example code for a Python ORM called <a href="http://charlesleifer.com/blog/peewee-a-lightweight-python-orm/">Peewee</a>, I realized something: with ORMs, you are still essentially defining all of your models as <i>relations</i>. You're just doing it in code rather than in SQL. For example, consider this model definition for a blog from the <a href="http://peewee.readthedocs.org/en/latest/peewee/models.html">Peewee documentation</a>:<br />
<pre style="-webkit-box-shadow: rgb(216, 216, 216) 1px 1px 1px; background-color: white; border: 1px solid rgb(198, 201, 203); color: #222222; font-size: 1.1em; line-height: 1.2em; margin-bottom: 1.5em; margin-top: 1.5em; overflow-x: auto; overflow-y: hidden; padding: 10px;"><span class="k" style="color: #007020; font-weight: bold;">class</span> <span class="nc" style="color: #0e84b5; font-weight: bold;">Blog</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">name</span> <span class="o" style="color: #666666;">=</span> <span class="n">CharField</span><span class="p">()</span> <span class="c" style="color: #408090; font-style: italic;"># <-- VARCHAR</span>
<span class="k" style="color: #007020; font-weight: bold;">class</span> <span class="nc" style="color: #0e84b5; font-weight: bold;">Entry</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">headline</span> <span class="o" style="color: #666666;">=</span> <span class="n">CharField</span><span class="p">()</span>
    <span class="n">content</span> <span class="o" style="color: #666666;">=</span> <span class="n">TextField</span><span class="p">()</span> <span class="c" style="color: #408090; font-style: italic;"># <-- TEXT</span>
    <span class="n">pub_date</span> <span class="o" style="color: #666666;">=</span> <span class="n">DateTimeField</span><span class="p">()</span> <span class="c" style="color: #408090; font-style: italic;"># <-- DATETIME</span>
    <span class="n">blog</span> <span class="o" style="color: #666666;">=</span> <span class="n">ForeignKeyField</span><span class="p">(</span><span class="n">Blog</span><span class="p">)</span> <span class="c" style="color: #408090; font-style: italic;"># <-- INTEGER referencing the Blog table</span></pre>
Notice that we have to specify the <i>type of a database column</i> with each of our attributes. Since Python has no explicitly typed variables from which to infer this information, I guess this is permissible. Further notice that our entry has a <span style="font-family: Courier New, Courier, monospace;">ForeignKeyField </span><span style="font-family: inherit;">that specifies the relationship between </span><span style="font-family: Courier New, Courier, monospace;">Entry </span><span style="font-family: inherit;">and </span><span style="font-family: Courier New, Courier, monospace;">Blog</span><span style="font-family: inherit;">. <b>What we're left with is little more than a razor-thin layer of abstraction put on top of a relational database.</b></span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">These objects would look quite a bit different if we weren't so concerned about mapping them. In particular, </span><span style="font-family: Courier New, Courier, monospace;">Blog </span><span style="font-family: inherit;">would likely have a list of </span><span style="font-family: Courier New, Courier, monospace;">Entry</span><span style="font-family: inherit;">, rather than each </span><span style="font-family: Courier New, Courier, monospace;">Entry </span><span style="font-family: inherit;">having a "foreign key" to </span><span style="font-family: Courier New, Courier, monospace;">Blog</span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">.</span><br />
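<br />
For comparison, here is a sketch of how the same two classes might look as plain objects with no mapping concerns at all. It assumes Python 3.7+ dataclasses. The <span style="font-family: Courier New, Courier, monospace;">entries</span> attribute and the <span style="font-family: Courier New, Courier, monospace;">publish</span> method are names made up for this illustration and are not part of Peewee.<br />

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Entry:
    headline: str
    content: str
    pub_date: datetime


@dataclass
class Blog:
    name: str
    # The blog owns its entries; there is no "foreign key" on Entry.
    entries: List[Entry] = field(default_factory=list)

    def publish(self, entry):
        self.entries.append(entry)
```

Here the relationship points from the container to its contents, which is the opposite direction from the foreign key in the relational version.<br />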
<span style="font-family: inherit;"><br /></span>
Maybe making our classes look more like relations is the price we pay. It's certainly more appealing than maintaining manual mapping code, and most businesses are wary of using an object store that the developers essentially have exclusive control over.<br />
<br />
This is a hard problem. I guess that's why Ted Neward calls it the <a href="http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx">Vietnam of Computer Science</a>.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com2tag:blogger.com,1999:blog-2384549447454153727.post-27525097936894630032012-07-19T23:03:00.002-07:002012-07-19T23:03:47.499-07:00Privates Done RightWhen I started learning Python, I quickly learned that there is no way to officially make a member variable private. There is only a convention that any variable starting with a single or double underscore should be treated as an implementation detail that is likely to change. If you try to access it, however, Python will not stop you.<br />
<br />
I think that this is the right approach. Sometimes there are legitimate reasons for accessing a private variable, for example when testing legacy code or using a third-party API. When I have a legitimate reason to do so, I don't want to have to go through some laborious process to get at the data, such as reflection.<br />
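<br />
A short sketch of what this looks like in practice follows. The <span style="font-family: Courier New, Courier, monospace;">Account</span> class is hypothetical. One detail worth knowing is that a double-underscore attribute is name-mangled, so from outside the class it is reachable under a slightly different name.<br />

```python
class Account(object):
    def __init__(self, balance):
        # Single underscore: private by convention only.
        self._owner_note = "internal detail"
        # Double underscore: stored as _Account__balance (name mangling).
        self.__balance = balance


account = Account(100)

# The single-underscore attribute is accessible like any other.
note = account._owner_note

# The double-underscore attribute is reachable through its mangled name.
mangled = account._Account__balance

# The plain double-underscore name does not exist outside the class.
try:
    account.__balance
    plain_name_works = True
except AttributeError:
    plain_name_works = False
```

So Python makes the access slightly awkward, as a hint that you are reaching into implementation details, but it does not forbid it.<br />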
<br />
Of course, with new code, the real solution is to write more modular classes with a single responsibility. This way, there are more public entry points to test classes in isolation.<br />
<br />
For example, consider this trivial class that performs a calculation and writes the result to the console:<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">class Logger(object):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def write(self, value):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print(value)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">class DataDisplayer(object):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def __init__(self, logger):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__logger = logger</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def display(self, value1, value2):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value = self.__do_calculation(value1, value2)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__logger.write(value)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def __do_calculation(self, value1, value2):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return value1 * value2</span><br />
<br />
<span style="font-family: inherit;">In order to test the </span><span style="font-family: 'Courier New', Courier, monospace;">__do_calculation</span><span style="font-family: inherit;"> method, we have a few options. We can call the pseudo-private method directly. Or we can test it through its public interface, </span><span style="font-family: 'Courier New', Courier, monospace;">display</span><span style="font-family: inherit;">. This isn't desirable, though, because it has the side effect of writing a value to the screen.</span><br />
<span style="font-family: inherit;"><br /></span><br />
<span style="font-family: inherit;">The ideal solution is to pull the responsibility of </span><span style="background-color: white;"><span style="font-family: 'Courier New', Courier, monospace;">__do_calculation</span><span style="font-family: inherit;"> into its own class so we can test that in isolation:</span></span><br />
<span style="background-color: white;"><span style="font-family: inherit;"><br /></span></span><br />
<span style="background-color: white;"><span style="font-family: 'Courier New', Courier, monospace;"></span></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">class DataDisplayer(object):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def __init__(self, logger, calculator):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__logger = logger</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__calculator = calculator</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def display(self, value1, value2):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value = self.__calculator.do_calculation(value1, value2)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__logger.write(value)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">class Calculator(object):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def do_calculation(self, value1, value2):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return value1 * value2</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: inherit;">Now we have a public interface to test </span><span style="font-family: 'Courier New', Courier, monospace;">do_calculation </span><span style="font-family: inherit;">in isolation.</span><br />
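<br />
As a sketch, such a test could look like the following. The <span style="font-family: 'Courier New', Courier, monospace;">Calculator</span> class is repeated so the example runs on its own. The test class name is an assumption.<br />

```python
import unittest


class Calculator(object):
    def do_calculation(self, value1, value2):
        return value1 * value2


class CalculatorTest(unittest.TestCase):
    def test_do_calculation_multiplies_its_arguments(self):
        # No logger and no console output involved: just input and result.
        self.assertEqual(20, Calculator().do_calculation(4, 5))
```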
<span style="font-family: inherit;"><br /></span><br />
<span style="font-family: inherit;">In other words, when we write our tests and classes correctly, we shouldn't have to access private variables very often. For those rare cases when we need to, though, it's kind of nice when the language doesn't try to prevent us.</span>Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-34276893205514787552012-06-28T21:55:00.001-07:002012-06-28T21:55:35.664-07:00Unit Testing and Assumptions<br />
When we write a unit test, for better or worse, we're locking down a section of code. This is good when we want to make sure the logic doesn't change. This can be bad, however, when our tests make it difficult to make <i>good </i>changes. Maybe you need to fix a bug, or change the details of an algorithm.<br />
<br />
The problem is, sometimes unit tests make too many assumptions about the code under test. We assume that an algorithm will be implemented in a certain way. Perhaps the method contains a lot of side effects.<br />
<br />
In my mind, the best-case scenario involves feeding input to a function and getting something in return. I give you a sentence, and you capitalize every word for me. I don't care how you do it, just that the output is what I expect:<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">def test_upper(self):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;input = "this is a sentence."</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;output = upper(input)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;self.assertEqual("This Is A Sentence.", output)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
These tests are short, easy to write, and they make no assumptions about the underlying code. It turns out that the fewer side effects a method has, the easier it is to test.<br />
<br />
Consider the following code:<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">class box:</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def __init__(self, length, width):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__length = length</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.__width = width</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;def compute_area(self):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return self.__length * self.__width</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span><br />
<span style="font-family: inherit;">The constructor just sets two private fields. This is difficult to test without either accessing the internals of the class or exposing the fields through getters. Even then, we would be testing implementation details that are subject to change. I would argue that we should test the constructor indirectly by testing </span><span style="font-family: 'Courier New', Courier, monospace;">compute_area</span><span style="font-family: inherit;"> as follows:</span><br />
<span style="font-family: inherit;"><br /></span><br />
<span style="font-family: 'Courier New', Courier, monospace;">def test_compute_area(self):</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;my_box = box(4, 5)</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;self.assertEqual(20, my_box.compute_area())</span><br />
<span style="font-family: inherit;"><br /></span><br />
<span style="font-family: inherit;">What we're really interested in is <i>not </i>that two private fields get set in the constructor, but that the object can compute its area.</span><br />Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-69951613346537771232012-06-23T10:41:00.001-07:002012-06-23T12:39:07.557-07:00My First Open Source ProjectIn my first post, I mentioned my <a href="http://bit-builder.blogspot.com/2012/06/virtues-of-weak-typing.html">first project</a> that I worked on in Python, called BatchEdit. I recently decided to <a href="https://github.com/jhewlett/BatchEdit">host</a> it on GitHub in order to motivate myself to work on it some more as well as to generate interest in the project.<br />
<br />
BatchEdit is a command-line batch photo editor. Basically, you give it an input and output folder and specify some adjustments to be done, such as resizing, sharpening, boosting contrast, adding a border, etc. Here is an example of what the command looks like to <span style="background-color: white;">auto-rotate,
increase contrast, convert to grayscale, resize to 720 pixels,
sharpen, add a gray border of 5 pixels, and overlay a watermark:</span><br />
<br />
<div style="margin-bottom: 0in;">
<span style="font-family: 'Courier New', Courier, monospace;">python
scripts\BatchEdit.zip --input C:\input --output C:\output</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">--autorotate --resize 720 --grayscale --contrast 1.2 --sharpen 1.3</span><br />
<span style="font-family: 'Courier New', Courier, monospace;">--border 5,gray --watermark C:\logo_transparent.png</span></div>
<br />
<span style="background-color: white;">Here is what the resulting transformation looks like:</span><br />
<br />
<table>
<tbody>
<tr><td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhC-Ax8ivo6ysKG8i3RBxkL54yFFjHjX1sSLva73ZM0cK1c0XsfwvVS5H_qQ51VNaAry3TyF77LAOOeC8KrpEBFy5XFQ5KefCnVzqCQn8ua8qEG0IyRUaIF-0tn0XOSj3f6Hfmuo-NQUl4/s1600/photo_orig.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhC-Ax8ivo6ysKG8i3RBxkL54yFFjHjX1sSLva73ZM0cK1c0XsfwvVS5H_qQ51VNaAry3TyF77LAOOeC8KrpEBFy5XFQ5KefCnVzqCQn8ua8qEG0IyRUaIF-0tn0XOSj3f6Hfmuo-NQUl4/s320/photo_orig.jpg" width="320" /></a>
</div>
</td><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1VvctCA8NzHlm-rw8z8Z7Aw5Cz84D4qWBzYS3jzNWXbHK70-wXc0eiBUleKSDZSf1uUAteJbhwIAloZ1GOYfKgMOTPoxXrxVgNTYUW5rk81zC3HWztj5KzCS5HBBCsHZ-QRsFGTK8M2w/s1600/photo_edit.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1VvctCA8NzHlm-rw8z8Z7Aw5Cz84D4qWBzYS3jzNWXbHK70-wXc0eiBUleKSDZSf1uUAteJbhwIAloZ1GOYfKgMOTPoxXrxVgNTYUW5rk81zC3HWztj5KzCS5HBBCsHZ-QRsFGTK8M2w/s320/photo_edit.jpg" width="214" /></a></td></tr>
</tbody></table>
<br />
In my mind, the most common use case for this program is preparing photos to upload to the web. Often photographers want to auto-rotate, resize, sharpen, and apply some basic adjustments such as boosting the contrast or saturation before uploading. If you have many photos to edit, doing this manually in a program like Photoshop is quite tedious.
<br />
<br />
Under the hood, the code takes advantage of the <a href="http://www.pythonware.com/products/pil/">Python Imaging Library</a> (PIL) for all image manipulation.<br />
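The option handling around that library can be sketched with nothing but the standard library. The following is a rough illustration using <i>argparse</i>, not BatchEdit's actual code: the flag names come from the example command earlier in this post, while the types, the <i>parse_border</i> helper, and the <i>build_parser</i> function are assumptions made for the sketch.<br />

```python
import argparse


def parse_border(value):
    """Parse a 'WIDTH,COLOR' string such as '5,gray' into (5, 'gray')."""
    width, _, color = value.partition(",")
    if not width.isdigit() or not color:
        raise argparse.ArgumentTypeError("expected WIDTH,COLOR, e.g. 5,gray")
    return int(width), color


def build_parser():
    # Flag names mirror the example command; types and defaults are guesses.
    parser = argparse.ArgumentParser(prog="BatchEdit")
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--autorotate", action="store_true")
    parser.add_argument("--grayscale", action="store_true")
    parser.add_argument("--resize", type=int)
    parser.add_argument("--contrast", type=float)
    parser.add_argument("--sharpen", type=float)
    parser.add_argument("--border", type=parse_border)
    parser.add_argument("--watermark")
    return parser


# Parse the same options as the example command, passed as a list.
args = build_parser().parse_args([
    "--input", "C:\\input", "--output", "C:\\output",
    "--autorotate", "--resize", "720", "--grayscale",
    "--contrast", "1.2", "--sharpen", "1.3",
    "--border", "5,gray", "--watermark", "C:\\logo_transparent.png",
])
print(args.resize, args.border)
```

Each parsed value would then be handed to the corresponding image operation. Converting <i>--border</i> in a custom type function keeps malformed values such as <i>5</i> or <i>gray</i> out of the editing code, since argparse rejects them before any photo is opened.<br />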
<br />
Currently there are a few shortcomings of the program. First, it requires that the end user have Python and PIL installed on his system. Second, it does not yet have a GUI on top of it. These limitations would probably keep all but the most expert users from using it.<br />
<br />
Batch photo editors have certainly been done before, but I think mine is simpler than most. <span style="background-color: white;">To see the source code, click </span><a href="https://github.com/jhewlett/BatchEdit" style="background-color: white;">here</a><span style="background-color: white;">.</span><br />
<br />Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-17759660533860005242012-06-21T21:33:00.002-07:002012-06-21T21:33:29.661-07:00Lean on the CompilerI've been thinking a lot lately about dynamic typing <span style="background-color: white;">–</span><span style="background-color: white;"> about how it supposedly sets you free from the bonds of the compiler and makes you 10x more productive. Some </span><a href="http://www.artima.com/weblogs/viewpost.jsp?thread=4639" style="background-color: white;">blog posts</a><span style="background-color: white;"> even go so far as to </span><span style="background-color: white;">claim </span><span style="background-color: white;">that dynamic typing will </span><i style="background-color: white;">replace</i><span style="background-color: white;"> strong typing due to the rise of unit testing. The advocates rightly point out that just because a program compiles does not mean it is correct. Unit tests are therefore a better safety net. Now, provided the code has full coverage, I agree that we can probably do without compile-time checking.</span><br />
<br />
But is a fully covered code base a reality? Not likely, especially when you're working with legacy code.<br />
<br />
Today at work I was changing quite a few method signatures <span style="background-color: white;">–</span><span style="background-color: white;"> adding and removing parameters on several methods across many different classes. In situations like this, I'm glad to have the compiler to make sure I don't miss anything, even before I have a chance to run what tests I do have. "Lean on the compiler" is the phrase Michael Feathers uses in </span><a href="http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052" style="background-color: white;">Working Effectively with Legacy Code</a><span style="background-color: white;">. Great read, by the way.</span><br />
<br />
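To make the contrast concrete, here is a small Python sketch of the same kind of change. The function names are hypothetical and have nothing to do with my code at work. A compiler would flag the stale call site as soon as the signature changed, whereas Python reports it only when that line actually runs.<br />

```python
def resize(image, width, height):
    """Hypothetical function whose signature just gained a `height` parameter."""
    return (image, width, height)


def make_thumbnail(image):
    # Stale call site: written when resize() took only (image, width).
    return resize(image, 128)


# Defining make_thumbnail raised nothing; the mismatch appears only on execution.
try:
    make_thumbnail("photo.jpg")
except TypeError as error:
    print("caught at runtime:", error)
```

If no test happens to call <i>make_thumbnail</i>, the mistake ships unnoticed.<br />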
Today I realized that this is one technique that I really miss in dynamically typed languages. Rather than holding me back, in this case the compile-time checking gave me the confidence to make changes quickly.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-23164742856318250162012-06-09T11:33:00.000-07:002012-06-09T11:33:57.513-07:00When TDD Breaks DownThere's been a lot of hype recently about Test-Driven Development (TDD). I really like it for the most part. It forces the programmer to think about <i>how</i> the code will be used. This often results in an easier-to-use API. It's also a good way to explore all of the edge cases and make sure each one is covered by a unit test.<br />
<br />
I've found, though, that sometimes having to write the test <i>before</i> the code is a little too restrictive. Sometimes you have no idea what the resulting code will look like or how exactly it will achieve your goal. In this case, I am an advocate of what I like to call <i>exploratory coding</i>. Code first until you have an idea of what your code needs to do. This may be accompanied by some ad-hoc testing. Then, when you have the functionality nailed down, cover it with tests. Of course, this only works if you're careful to keep your code modular and testable. If not, you may need to refactor first or rewrite the code altogether following TDD.<br />
<br />
We often joke at work that the proper way to do TDD is to write the code first and comment it out, then get your failing test and fix it by uncommenting the code. This just goes to show how counter-intuitive test-first can be at times.<br />
<br />
I often hear people argue that if you write the code first, there's a chance you'll get lazy and won't get around to testing it afterwards. I disagree. This just requires a little discipline, which I would argue is essential to being a good developer in the first place.<br />
<br />
Essentially, the principle behind TDD is the importance of modular, well-tested code. Indeed, I would say that the rise of unit testing is one of the biggest developments to improve the overall quality of code in recent times.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0tag:blogger.com,1999:blog-2384549447454153727.post-86956017530977075052012-06-03T20:52:00.001-07:002013-02-05T19:11:18.538-08:00The Virtues of Dynamic TypingLast semester, I took a class in Python. Coming from a C++/C# background, many things were new to me. I liked not having to use curly braces. I also liked not having to declare the type of my variables.<br />
<br />
A few things caught me off guard, however. I found it strange that Python did not allow me to declare my class variables as private. I was also disappointed that I couldn't explicitly define an interface.<br />
<br />
What I initially considered limitations of the language, I eventually regarded as liberating. I felt empowered to be a responsible programmer without being babysat by the compiler. <i>I</i> knew what types I was using, and <i>I</i> knew which variables I intended to be private. When I wanted to use an interface, I simply used the same method signature in several different classes. Indeed, my productivity increased significantly as a result of this flexibility.<br />
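<br />
As a rough illustration of both habits, here is a minimal sketch with made-up class names. It is not code from my class project. The two classes share an <i>apply</i> method without declaring any interface, and a leading underscore marks a field as private by convention only.<br />

```python
class Grayscale:
    def __init__(self):
        self._calls = 0  # leading underscore: private by convention only

    def apply(self, pixels):
        self._calls += 1
        # Average the three channels of each (r, g, b) pixel.
        return [sum(p) // 3 for p in pixels]


class Invert:
    def apply(self, pixels):
        return [tuple(255 - c for c in p) for p in pixels]


def run(adjustment, pixels):
    # No declared interface: anything with an apply(pixels) method works.
    return adjustment.apply(pixels)


pixels = [(0, 0, 0), (30, 60, 90)]
print(run(Grayscale(), pixels))
print(run(Invert(), pixels))
```

Nothing stops a caller from reading <i>_calls</i> or from passing an object that lacks <i>apply</i>. Both are left to the programmer's discipline, which is exactly the trade-off described above.<br />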
<br />
At the end of the class we had a chance to build a final project. In just three weeks, I was able to build a command-line batch photo editor. Working on it was a joy, and the resulting code was very clean and terse.<br />
<br />
The one frustration I never resolved satisfactorily was deployment. Either the end user needs to have Python installed on his system, or you must somehow package up the interpreter with the application. Neither solution is ideal.<br />
<br />
Overall, I concluded that in many cases, static typing is an unnecessary burden to programmers. Dynamic typing is not without its problems, but Python will definitely hold a prominent place in my quiver from now on.Justin Hewletthttp://www.blogger.com/profile/08558889981804501934noreply@blogger.com0