DRY RoT
by joe

Posted on 2019-02-02



An exercise in which we perform a tuck-position triple somersault with a full twist off the 3-meter programmatic springboard.


Setting Up the Exercise

This blog currently consists of three discursive threads (BLOX, Utility, and 20 Years) and a few 'unthreaded' pages, including the Index and About pages and a development-context-only 'Scope' page.

A while back we were struck by the amount of duplication in the code that implements the Flask view functions for those three 'unthreaded' pages. We decided to exercise the DRY (Don't Repeat Yourself) principle to trim some of that duplication in order to simplify the implementation.

Near-Identical Twins

Let's start with the code for the index page. This blog_index() view function is pretty straightforward: 1) get the data for the index page; 2) return an error if it is not there; 3) set the time value shown at the bottom of each page to show when it was published; 4) collect and format the Rhyme and Stanza data for the page; and 5) pass all of the processed data to the template for the page.

The highlighting shows the few characters that differ between this code and the code for the about page.


@blg_bp.route('/')
def blog_index():
    db_result = huddle.get_index_spiel()

    if db_result is None:
        return f'No post found for: index.html'

    huddle.e_spiel.publish_date = str(datetime.utcnow())[0:19]

    segments = []
    for rhyme in huddle.e_spiel.rhymes:
        segments.append(rhyme.present())

    stanzas = []
    for stanza in huddle.e_spiel.stanzas:
        stanzas.append(stanza.present())

    return render_template('index.html',
                           spiel=huddle.e_spiel,
                           segments=segments,
                           stanzas=stanzas)

And here's the code for the about page - with similar highlighting.


@blg_bp.route('/about.html')
def blog_about():
    db_result = huddle.get_about_spiel()

    if db_result is None:
        return f'No post found for: about.html'

    huddle.e_spiel.publish_date = str(datetime.utcnow())[0:19]

    segments = []
    for rhyme in huddle.e_spiel.rhymes:
        segments.append(rhyme.present())

    stanzas = []
    for stanza in huddle.e_spiel.stanzas:
        stanzas.append(stanza.present())

    return render_template('about.html',
                           spiel=huddle.e_spiel,
                           segments=segments,
                           stanzas=stanzas)
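
To make "the few characters that differ" concrete, here is a small sketch that compares abbreviated copies of the two view functions line by line. The constant names and the differing_lines() helper are ours, invented for illustration; they are not part of the BLOX code.

```python
# Abbreviated copies of the two view functions, as plain text.
INDEX_VIEW = """\
@blg_bp.route('/')
def blog_index():
    db_result = huddle.get_index_spiel()
    if db_result is None:
        return f'No post found for: index.html'
    return render_template('index.html', spiel=huddle.e_spiel)
"""

ABOUT_VIEW = """\
@blg_bp.route('/about.html')
def blog_about():
    db_result = huddle.get_about_spiel()
    if db_result is None:
        return f'No post found for: about.html'
    return render_template('about.html', spiel=huddle.e_spiel)
"""


def differing_lines(a, b):
    """Pair up the lines of a and b and keep only the pairs that differ."""
    pairs = zip(a.splitlines(), b.splitlines())
    return [(x, y) for x, y in pairs if x != y]


diffs = differing_lines(INDEX_VIEW, ABOUT_VIEW)
for old, new in diffs:
    print(f'- {old.strip()}')
    print(f'+ {new.strip()}')
```

Every differing pair comes down to the same substitution: 'index' or '/' on one side, 'about' on the other.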

The huddle.get_xxxxx_spiel() methods

We have a similar duplication in the two database retrieval methods in the Huddle() object. Shown below are the get_index_spiel() and get_about_spiel() methods.


def get_index_spiel(self):
    get_db()
    self.d_spiel = dbx_seed.db_read_index_spiel(g.db)
    if self.d_spiel is None:
        return None
    self.e_spiel = deepcopy(self.d_spiel)
    return True

def get_about_spiel(self):
    get_db()
    self.d_spiel = dbx_seed.db_read_about_spiel(g.db)
    if self.d_spiel is None:
        return None
    self.e_spiel = deepcopy(self.d_spiel)
    return True

The db_read_xxxxx_spiel() functions

The actual database access code is similarly structured, almost exactly the same for index as for about. Again, the differences are highlighted.


def db_read_index_spiel(db):
    c = db.cursor()
    sql = f'select * from spiel where thread="BLOX" and stencil="Home" '
    c.execute(sql)
    entry = c.fetchall()
    if not entry:
        return None
    db_spiel = entry[0]
    spiel = db_map.db_map_spiel(db_spiel)
    spiel.rhymes = []
    spiel.stanzas = []
    fill_stanzas(c, spiel)
    return spiel


def db_read_about_spiel(db):
    c = db.cursor()
    sql = f'select * from spiel where thread="BLOX" and stencil="About" '
    c.execute(sql)
    entry = c.fetchall()
    if not entry:
        return None
    db_spiel = entry[0]
    spiel = db_map.db_map_spiel(db_spiel)
    spiel.rhymes = []
    spiel.stanzas = []
    fill_stanzas(c, spiel)
    return spiel

Diving In

We started with the duplicated UI code in the Flask view functions and discovered that the same duplications flowed through the middleware and into the database access functions that support the UI. At each level we apparently have an option to exercise the DRY principle. At each level, in following the DRY process, we find little bits of code and data structures that need some tweaking in order to make the resulting consolidated code work well for the two cases combined in the result.

DRY Rotation #1 - Database Code

Our first DRY rotation is at the innermost database level. Shown below is the database-level consolidated code that retrieves the index or about spiel record as required. Note that, behind the scenes, we've added a "none" option to the possible thread values and reassigned the index and about spiels accordingly. (original code)

We also tidied up the mapping and filling of the Rhyme and Stanza contents associated with the Spiel - but we're going to let that ride for this post.


def db_read_unthreaded_spiel(db, stencil):
    c = db.cursor()
    sql = f'select * from spiel where thread="none" and stencil="{stencil}" '
    c.execute(sql)
    entry = c.fetchall()
    if not entry:
        return None
    db_spiel = entry[0]
    spiel = map_and_fill(c, db_spiel)
    return spiel

This change to a generalized function for retrieving a specified "unthreaded" spiel suggests similar changes to the middleware Huddle() that acts as the bridge between the database and the UI.
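
One side note on the consolidated query: the stencil value is interpolated directly into the SQL text. That is harmless while the only callers pass hard-coded strings, but a parameterized query is a common way to stay safe if that ever changes. Here is a minimal sketch, assuming the database is SQLite accessed through Python's sqlite3 module (the post does not say) and using a hypothetical two-column table. The function name read_unthreaded_spiel_row() is ours, and the mapping step is left out.

```python
import sqlite3


def read_unthreaded_spiel_row(db, stencil):
    """Return the first unthreaded spiel row for `stencil`, or None."""
    c = db.cursor()
    # The ? placeholders let the driver bind the values, so a stencil
    # containing a quote character cannot alter the statement.
    c.execute(
        "select * from spiel where thread = ? and stencil = ?",
        ("none", stencil),
    )
    return c.fetchone()


# Hypothetical schema, just enough to exercise the query.
db = sqlite3.connect(":memory:")
db.execute("create table spiel (thread text, stencil text)")
db.executemany(
    "insert into spiel values (?, ?)",
    [("none", "Home"), ("none", "About"), ("BLOX", "DRY RoT")],
)

home_row = read_unthreaded_spiel_row(db, "Home")
missing_row = read_unthreaded_spiel_row(db, "No Such Stencil")
odd_row = read_unthreaded_spiel_row(db, 'Home" or "1"="1')
```

The last call passes a value that would change the meaning of an interpolated query; with placeholders it is simply a stencil name that matches nothing.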

DRY Rotation #2 - Huddle Code

Recall that the Huddle() methods get_index_spiel() and get_about_spiel() matched up one-to-one with the database functions db_read_index_spiel() and db_read_about_spiel(). It is little surprise, then, that the Huddle() methods refactor into a single method that mirrors the refactoring of the database code. (original code)


def get_unthreaded_spiel(self, stencil):
    get_db()
    self.d_spiel = dbx_seed.db_read_unthreaded_spiel(g.db, stencil)
    if self.d_spiel is None:
        return None
    self.e_spiel = deepcopy(self.d_spiel)
    return True

Like the database refactoring before, this code consolidation looks to be a clean application of the DRY principle, producing simple, easily readable code without a lot of fuss.
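
To see the consolidated method at work outside of Flask, here is a self-contained sketch. The FAKE_DB dictionary and fake_read_unthreaded_spiel() are hypothetical stand-ins for the database layer, and this Huddle class is a stripped-down imitation rather than the real one.

```python
from copy import deepcopy

# Hypothetical stand-in for the database layer: stencil -> spiel data.
FAKE_DB = {
    'Home': {'title': 'Index', 'stanzas': ['welcome']},
    'About': {'title': 'About', 'stanzas': ['who we are']},
}


def fake_read_unthreaded_spiel(stencil):
    """Stand-in for dbx_seed.db_read_unthreaded_spiel(); None if not found."""
    return FAKE_DB.get(stencil)


class Huddle:
    def __init__(self):
        self.d_spiel = None
        self.e_spiel = None

    def get_unthreaded_spiel(self, stencil):
        self.d_spiel = fake_read_unthreaded_spiel(stencil)
        if self.d_spiel is None:
            return None
        # Work on a copy so that changes never reach the retrieved record.
        self.e_spiel = deepcopy(self.d_spiel)
        return True


huddle = Huddle()
found_home = huddle.get_unthreaded_spiel('Home')
huddle.e_spiel['stanzas'].append('edited')      # touches the copy only
home_untouched = huddle.d_spiel['stanzas'] == ['welcome']

found_about = huddle.get_unthreaded_spiel('About')
found_missing = huddle.get_unthreaded_spiel('Missing')
```

One method now serves both pages, and the deepcopy keeps d_spiel and e_spiel independent of each other.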

DRY Rotation #3 - UI Code

With the database utility code taken care of, let's recall the Flask view functions that inspired today's exercise (original code). The extensive duplication of code there is not hard to spot, and it offers plenty of opportunity to consolidate by applying the DRY principle.

We start with the Flask route decorator lines. Flask allows stacking route decorators, so it is straightforward to associate both routes with a single view function. There's no reduction in code here, but we are able to consolidate the two lines atop the blog_index() function.
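
Stacking works because a route decorator registers the function and then hands it back unchanged, which is how we understand Flask's route() to behave. The sketch below imitates that with a toy registry; ROUTES and route() are hypothetical stand-ins written for this illustration, not Flask's own code.

```python
# Toy registry: URL path -> view function.
ROUTES = {}


def route(path):
    """Toy stand-in for a Flask-style route decorator (not Flask itself)."""
    def register(view):
        ROUTES[path] = view
        # Returning the function unchanged is what makes stacking possible.
        return view
    return register


@route('/about.html')
@route('/')
def blog_index():
    return 'rendered page'


registered_paths = list(ROUTES)
```

Decorators are applied from the bottom up, so '/' is registered first and '/about.html' second, and both paths end up pointing at the same function.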

Lacking variable values in the routing strings, we have to dig into the Flask request object to identify which page we are handling. The request.path value lets us set the stencil and fname variables that we use to retrieve the correct page data from the database and to present the correct page through the render_template() call in the return statement.

The two database access methods in the Huddle() object have been combined into a single method with a parameter. And we can use the fname value to fill out the error return message.

That leaves only identifying the correct Jinja template in the return statement using the fname value again.

With this cornucopia of duplicated code staring us in the face, we embark on a DRY refactoring to simplify it all.

And this is what we came up with.


@blg_bp.route('/about.html')
@blg_bp.route('/')
def blog_index():
    if request.path == '/':
        stencil = 'Home'
        fname = 'index.html'
    else:
        stencil = 'About'
        fname = 'about.html'

    db_result = huddle.get_unthreaded_spiel(stencil)

    if db_result is None:
        return f'No post found for: {fname}'

    huddle.e_spiel.publish_date = str(datetime.utcnow())[0:19]

    segments = []
    for rhyme in huddle.e_spiel.rhymes:
        segments.append(rhyme.present())

    stanzas = []
    for stanza in huddle.e_spiel.stanzas:
        stanzas.append(stanza.present())

    return render_template(fname,
                           spiel=huddle.e_spiel,
                           segments=segments,
                           stanzas=stanzas
                           )

The 'if' statement at the beginning sets the values for stencil and fname that we need as parameters to distinguish the two cases for the various functions - the new function huddle.get_unthreaded_spiel() and the existing render_template() function.
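
The same path-to-values decision can also be written as a lookup table. This is only an alternative sketch of the idea, not what the BLOX code does; PAGES and page_for() are hypothetical names. Note one difference in behavior: the 'if' statement treats every path other than '/' as the about page, while the table returns None for a path it does not know.

```python
# Request path -> (stencil, template file name).
PAGES = {
    '/': ('Home', 'index.html'),
    '/about.html': ('About', 'about.html'),
}


def page_for(path):
    """Return (stencil, fname) for a known path, or None otherwise."""
    return PAGES.get(path)


index_page = page_for('/')
about_page = page_for('/about.html')
unknown_page = page_for('/nowhere.html')
```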

The DRY refactoring of the blog_index() and blog_about() view functions into a single blog_index() function now seems complete.

DRY - 360° Full Twist

You may remember that we started this DRY exploration by identifying three 'unthreaded' pages. To this point we have discussed only two - the index page and the about page.

The code excerpt below shows the result of adding the 'Scope' page to the mix.


@blg_bp.route('/scope.html', methods=['GET', 'POST'])
@blg_bp.route('/about.html')
@blg_bp.route('/')
def blog_index():
    if request.path == '/':
        stencil = 'Home'
        fname = 'index.html'
    elif request.path == '/about.html':
        stencil = 'About'
        fname = 'about.html'
    else:
        stencil = 'Scope'
        fname = 'scope.html'

    db_result = huddle.get_unthreaded_spiel(stencil)

    if db_result is None:
        return f'No post found for: {fname}'

    if 'scope' in request.form:
        current_app.config['SCOPE'] = request.form['scope']

    scope_val = current_app.config['SCOPE']

    huddle.e_spiel.publish_date = str(datetime.utcnow())[0:19]

    segments = []
    for rhyme in huddle.e_spiel.rhymes:
        segments.append(rhyme.present())

    stanzas = []
    for stanza in huddle.e_spiel.stanzas:
        stanzas.append(stanza.present())

    return render_template(fname,
                           spiel=huddle.e_spiel,
                           segments=segments,
                           stanzas=stanzas,
                           scope_val=scope_val,
                           scope_options=scope_options
                           )

This is starting to make us uncomfortable. To be sure, the DRY-consolidated code behind this UI - the code that handles retrieval of the database information for these three unthreaded pages - is unaffected by adding the responsibility for the 'Scope' page to this single blog_index() view function. But it's beginning to look like we've made a mistake somewhere.

And as it turns out, we've made at least two distinct mistakes.

Over-Rotated - The Wrong Abstraction

The first mistake is that reliance on these pages being 'unthreaded' is the wrong abstraction for linking them through this Flask view function.

Just look at the changes wrought with the addition of the 'Scope' page to the blog_index() view function. For starters, the Scope page has to accept HTTP POST requests as well as GET requests, where the others handle simple GET requests only. That shows up both in the Flask route decorator and in the added need to check and set the current_app.config['SCOPE'] value - and the need to communicate the current value along with the range of available values in the parameters to the render_template() function. This is a considerable departure from the sleek combination of the 'Index' and 'About' pages that we saw earlier.

All of this added code and these added parameters are telling us that we have fallen into a Wrong Abstraction hole. (See sidebars "The Wrong Abstraction" and "The Rule of Three".)

Fortunately, this is pretty simple for us to unwind. All we have to do is to re-separate the blog_scope() function from the mistakenly consolidated code. As we observed earlier, the blog_scope() view function still works just fine with the DRY refactorings of the Huddle() methods and database functions.

... I Forgot!

The second mistake is that linking the index page and the about page with this single view function does not work in our specific environment.

Well, this is embarrassing.

A central feature of the BLOX project is that we use an active Flask application to compose and edit these blog posts - and then we use the Frozen-Flask add-on to produce static web pages that we publish to multiple Internet- and network-accessible locations.

Our DRY consolidation of the blog_index() and blog_about() view functions transgresses the bounds of Frozen-Flask, which is designed to capture the results of invoking each specified Flask view function with its associated parameters. The problem that we have created is that our mechanism for distinguishing the invocations of blog_index() is hidden from Frozen-Flask inside the Flask application request object as the request.path value. As a result, Frozen-Flask cannot supply Flask with the information needed to drive the creation of the separate pages that are supposed to be produced by the blog_index() view function.
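
The sketch below is a deliberately simplified model of that problem, not Frozen-Flask's actual implementation. It assumes a page generator that enumerates view functions by endpoint name and builds one URL per endpoint. The build_url() and discover_urls() helpers, and the choice of which rule wins for a shared endpoint, are assumptions of the model.

```python
def build_url(rules, endpoint):
    """Toy URL builder: return the first path registered for the endpoint."""
    for path, name in rules:
        if name == endpoint:
            return path
    raise LookupError(endpoint)


def discover_urls(rules):
    """Build one URL per distinct endpoint, as an endpoint-driven tool would."""
    # dict.fromkeys() removes duplicate endpoint names while keeping order.
    endpoints = list(dict.fromkeys(name for _path, name in rules))
    return [build_url(rules, name) for name in endpoints]


# Consolidated routing: two paths, one view function.
CONSOLIDATED = [('/', 'blog_index'), ('/about.html', 'blog_index')]

# Original routing: one view function per path.
SEPARATE = [('/', 'blog_index'), ('/about.html', 'blog_about')]

consolidated_urls = discover_urls(CONSOLIDATED)
separate_urls = discover_urls(SEPARATE)
```

In this model the consolidated routing yields a single page, because nothing visible from the outside distinguishes the second invocation. The separate view functions yield both pages.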

Therefore, if we want the 'Index' and 'About' pages to be produced by Frozen-Flask, we have to find a way to feed Frozen-Flask the information it requires. The easiest way to accomplish this is to go back to our original implementation with separate view functions for the 'Index' and 'About' pages.

Once we have done that, though, yet another DRY opportunity is revealed - and after we have taken advantage of that opportunity, things look pretty good.

Bonus - DRY en passant

Even though we decided that our initial impulses were misguided and put things back much the way they started, we did find a little bit of DRY sunshine in a chunk of replicated code that represents a common, repetitive process.

All of our existing stencils provide for inclusion of stanzas - and all but the ToC stencils make use of rhymes. Furthermore, we uniformly set a publish_date value whenever the blog page is generated.

So we abstracted out the following code for use in all of the view functions in the routes.py module.


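def display_setup(spiel):
    # Stamp the page with the time it was generated (UTC, to the second).
    spiel.publish_date = str(datetime.utcnow())[0:19]

    # Collect the presentable form of each Rhyme.
    segments = []
    for rhyme in spiel.rhymes:
        segments.append(rhyme.present())

    # Collect the presentable form of each Stanza.
    stanzas = []
    for stanza in spiel.stanzas:
        stanzas.append(stanza.present())

    return segments, stanzas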
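
For anyone who wants to try the helper away from Flask and the database, here is a self-contained sketch that takes the spiel as a parameter. The Piece and Spiel classes are hypothetical stand-ins, and the markup returned by present() is invented for the example. The sketch also swaps in list comprehensions and datetime.now(timezone.utc), which produces the same 19-character timestamp shape.

```python
from datetime import datetime, timezone


class Piece:
    """Hypothetical stand-in for a Rhyme or a Stanza."""
    def __init__(self, text):
        self.text = text

    def present(self):
        # Invented markup; the real present() methods are not shown here.
        return f'<p>{self.text}</p>'


class Spiel:
    """Hypothetical stand-in for the spiel object."""
    def __init__(self, rhymes, stanzas):
        self.rhymes = rhymes
        self.stanzas = stanzas
        self.publish_date = None


def display_setup(spiel):
    # Same 'YYYY-MM-DD HH:MM:SS' shape as str(datetime.utcnow())[0:19].
    now = datetime.now(timezone.utc)
    spiel.publish_date = now.strftime('%Y-%m-%d %H:%M:%S')

    segments = [rhyme.present() for rhyme in spiel.rhymes]
    stanzas = [stanza.present() for stanza in spiel.stanzas]
    return segments, stanzas


spiel = Spiel(rhymes=[Piece('roses')],
              stanzas=[Piece('violets'), Piece('sugar')])
segments, stanzas = display_setup(spiel)
```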

Shown below is the result of using this abstraction for the blog_index() view function.


@blg_bp.route('/')
def blog_index():

    spiel, db_error = huddle.get_unthreaded_spiel('Home')

    if db_error is not None:
        flash(f'No post found for index.html.<br/>    {db_error}.')

    segments, stanzas = display_setup(spiel)

    return render_template('index.html',
                           spiel=spiel,
                           segments=segments,
                           stanzas=stanzas
                           )

Of course, the implementation of the blog_about() view function is almost identical. But now the duplication in those two functions is much easier to swallow. It's a relatively small amount of code - and the resulting functions drive the Frozen-Flask generation of static web pages correctly.

As a bonus, implementation of the blog_scope() view function benefits from the three separate DRY configurations as well. The net effect of our efforts is a real improvement in the code.


The score from the JoePython judges on the forward triple somersault with a full twist: 8.5


Just Skim This Post First

This is a very long post, certainly our longest to date. But it is a pretty easy read. Even though there is a lot of code, you only need to note the general structure - and we have taken some pains to highlight the (simple) differences between paired files and the resulting code changes following our code consolidation efforts.

Feel free to just skim through it.

This is a BLOX post, but ...

Unlike our developmental diary posts, this post is intended to illustrate specific programming principles and practices. As a result, the code examples don't always match exactly with the code in any one commit - or any connected series of commits.

Before and After

Here are some links into the BLOX repository at Bitbucket showing exactly what the code looked like before and after the processes described in this post.

BEFORE: AFTER: U-Turn:

The Wrong Abstraction

Our characterization "Wrong Abstraction" comes from a blog post by Sandi Metz. In that post, she lays out a pattern of successive misguided attempts to exercise the DRY principle. Our experience described in this post closely mimics the extended scenario she presents in her post - except we make the entire series of mistakes all by ourself.

The Rule of Three

Our use of the "Rule of Three" is a cautionary brake on mindless or automatic application of the DRY principle. It suggests holding off on consolidating code until you have at least three repetitions of the code.

The actual description on Wikipedia states that "when similar code is used three times, it should be extracted into a new procedure." That "should" bothers us a bit, but we certainly approve of the qualification that "choosing an appropriate design to avoid duplication might benefit from more examples to see patterns".

And yes, the acronym "RoT" in the title of this post is derived from "Rule of Three".

U-Turns

The penultimate step in the refactoring exercise described in this post is a full reversal of one of the previous steps. In fact, we actually reverse the refactoring that inspired the exercise to start with.

This is a type of U-turn that is fairly common in our software development practice. We call attention to it because it exemplifies that writing software is actually mostly re-writing software. Once you recognize that as programmers we spend most of our time making and correcting mistakes - whether our own mistakes or someone else's - developing good habits to support that dynamic seems a lot more appealing.

You'll see this homily repeated many times throughout this blog:

Good decisions come from experience.
Experience comes from bad decisions.

Comments

It will be some time yet before we get a comments section working here. In the meantime feel free to send comments via email. On this site our name is Joe Python. The email address is our first name at joepython.com.

Edited: 2019-02-05 20:42:45(utc) Generated: 2019-06-10 17:29:58(utc)