Introducing Search
by joe

Posted on 2019-01-06



Sisters are doing it for themselves ...

As you will see, this little DIY exercise has a big payoff for relatively little effort.


Search Me

Eagle-eyed readers likely spotted our inclusion of a (very) simple search mechanism into this BLOX project just a few days before our graduation to the Sprout stage.

Not that it was all that hard to spot, given that the commit (9a93874) message was "Stub in rudimentary search capability".

Still, the exercise was significant because it was so easy. The core implementation code in the search4() function is only 16 lines. Leaving aside the crudity of the implementation expressed in those 16 lines, the function works marvelously. We wound up using the facility immediately. The most satisfying exercise was locating our misspellings of "Montreau" and correcting them to "Montreux".

Take what the defense gives you

The search implementation was a gift from Sqlite3 via the LIKE clause of the SELECT statement (line 16 of the original dbx_search.py module).

We've now had almost a week to think about how to handle some of the challenges of integrating this search capability into the existing database maintenance cum composing facility in BLOX. What we decided on is to integrate the single-phrase search option into the main screen as a fourth filter element, supplementing the existing filters for thread, target and stencil. This approach now seems rather elegant in light of the over-elaborate bolt-on implementation that we first considered. (See "First Impulses (Alas)" below - included for your entertainment.)

This approach looks so promising that it seems safe to announce our intentions ahead of the actual implementation so that you can watch the process in near real time.

Step 1: Configure main DB-MAINT page with Search
The first thing we did was to integrate the Search Phrase field into the main DB_MAINT page.

Here's what the top of the page looked like before the change:


The modified layout is shown below. In addition to expanding the filter settings to include a "Search Phrase" text box, we removed the vestigial "Del" buttons - because we now have a functioning delete option on the edit_spiel page.



While we're at it, we are also going to modify some of the dynamics of this show_spiels.html page. Specifically, we eliminate the multiple <form> tags and remove the onchange="this.form.submit()" instructions so that the page does not go back to the server, executing a retrieval of ALL of the spiels every time one of the thread, target, or stencil settings is changed. Instead, we will extend the "selection context" state that we keep in the Flask session dictionary, keeping a new session['phrase'] value to go along with the others. With that change, we'll now require that the user press the "Search" button to trigger a new round-trip to the database in order to repopulate the page with a list of spiels that meet the criteria set at the top of the page.

[Well, that change to require pressing the "Search" button lasted about ten minutes. We tired of that small, manual necessity after only a few experimental trials. We restored the attribute onchange="this.form.submit()" for all three dropdown boxes and for the text field for entering the search phrase. We now expect the instant-reaction behavior is here to stay. We're also eying that void under the "Add New Spiel" button for putting in date range selectors.]

Step 2: Set Selection-Search conditions in the huddle.py modules
We've been doing a bit of processing in the show_all_spiels() function of the spiel_views.py module that by all rights really does not belong in the UI code. We'll use this opportunity to clean that up a bit.

The code shown below retrieves all spiel records from the database and then filters them by constructing a list of spiels whose thread, target and stencil values pass the filter conditions.

The problem is that this code really should be implemented closer to the database interface code in the dbx_seed.py module.


    spiel_list = []
    all_spiels, db_error = huddle.get_all_spiels()
    if db_error is not None:
        flash(db_error)
    for spiel in all_spiels:
        if spiel.thread != session['thread'] and session['thread'] != 'all':
            continue
        if spiel.stencil != session['stencil'] and session['stencil'] != 'all':
            continue
        if session['target'] == 'public' and spiel.target != 'public':
            continue
        elif session['target'] == 'review' and spiel.target not in ['public', 'review']:
            continue
        elif session['target'] == 'internal' and spiel.target == 'local':
            continue
        spiel_list.append(spiel)
The first stage of this change is to pass the selection and filer criteria from the UI back to the Huddle object. There we implement a @staticmethod to format those criteria for use by the dbx_seed.py module to get exactly the set of spiels we specified in the selection screen.


    @staticmethod
    def get_selected_spiels(phrase=None, thread=None,
                            target=None, stencil=None):
        if phrase == '':
            phrase = None
        if thread == 'all':
            thread = None
        if stencil == 'all':
            stencil = None
        if target == 'local':
            target = None
        elif target == 'internal':
            target = 'target="internal" or target="review" or target="public"'
        elif target == 'review':
            target = 'target="review" or target="public"'
        elif target == 'public':
            target = 'target="public"'

        get_db()
        return dbx_seed.db_search_selected_spiels(g.db,
                                                  phrase,
                                                  thread,
                                                  target,
                                                  stencil,
                                                  )

We'll go ahead a warn you that this code may strike you as a little bit cringey because it constructs string values that will wind up as segments of Sqlite3 SELECT statements. The alternative, though, is to make the database interface code aware of the application-level significance of terms such as all or local or public.

We think this is best. After all, the Huddle object is intended to serve as intermediary between the UI and the database. This is where the "business logic" belongs, and the encroachment on database processing seems unavoidable.

Step 3: Select the exact set of spiels in the dbx_seed.py module
In order to effect the changes needed in the dbx_seed.py module, we have added the new function db_search_selected_spiels().

First, we construct and execute a SELECT statement to get a set of spiel IDs that we call spiel_sid_set that meet the three conditions specified by the thread, target, and stencil parameters.

Next, we construct and execute two other SELECT statements, each of which produces a set of spiel IDs. The first set (spiel_phrase_set) represents spiels whose titles contain the search phrase. The second set (stanza_phrase_set) represents spiels whose contents (part_2 of a Rhyme or eitherpart_1 or part_2 of a Stanza) include the search phrase.

In the most general case, the ultimate set of spiel IDs comes from taking the union of spiel_phrase_set and stanza_phrase_set and then the intersection of that union (phrase_set) with spiel_sid_set. That result is the sid_set.

In the case where no search phrase is entered, the sid_set is just the spiel_sid_set.

The spiel IDs in sid_set are plugged into a SELECT statement, and the results of the SELECT are the spiels to be shown on the selection page.

A couple of notes:

1. If there are no spiels that match all three of the thread, target, and stencil values specified, we know that the ultimate intersection will be empty, so we can exit right away. As a bonus, we can give a slightly more specific "error" message.

2. Constructing SQL statements this way is b-a-a-a-a-d. See the "Future Lessons" sidebar for some explanation.

Now we have a search capability. HO-HO-HO

We now have a basic search capability. And we like it.

There's a lot of upside still available, of course. For example, we've been laboring under the mistaken belief that HTML tags would necessarily squelch search phrases where some words in the phrase were modified with tags. But it turns out that the Sqlite3 wild card characters work when used in the "Search Phrase" text field.

First Impulses (Alas)

NOTE: a few days after drafting this section we flashed on the idea of just making the single-phrase search process a fourth filter on the main DB-MAINT page. Feel free to ridicule this impulse.

Introducing even the most primitive search leads immediately to several challenges to the existing implementation. The first occurs from the natural flow related to a search: 1) first, you want to display some reference to elements found via the search request; 2) then you want to explore or act on those found elements in some way; and 3) then you want to move on to the next task in your session.

In our first cut at implementing the search capability, we return a list of Spiels in which the search phrase occurs. So far, so good. As part of the presentation of the successful search results, we display the title of the Spiel and a button that offers us the opportunity to examine and edit the Spiel and its contents to our heart's content.

Then we hit a wall. Once we have examined (and edited) the first relevant Spiel, where do we want to go - and how do we get there?

Up to this point, our only path into the edit_spiel function has been through the show_spiels.html template. When we finished editing on a Spiel, we could go back to that entrance point via the "Show All" button on the edit_spiel.html template or via the DB-Maint menu selection.

Now, though, we might have come to the edit_spiel function from the search_results.html template. So where do we go when we are done examining and editing the first of several search results? We may well want to go back to the search results screen and select the next result for examination and editing - but we have no way to get there directly.

So, one option is to offer a way to get back to the search results screen directly. There are a couple of ways of doing that directly: 1) we could add a "Return to Search Results" button to the screen template; 2) we might do better to just turn our search results screen into an "edit_search_results" template instead of using the existing edit_spiel.html template, making some sort of large, single-screen page to handle all of the examinations and editing without going anywhere to do the editing; 3) ...

[Well, we dodged this bullet by rolling the "Search Phrase" filter into the same screen as the pre-existing filters, thus eliminating the need to account for different routes into the editing screens. But this prospect of preserving "how did we get here" state is sure to recur. Stay tuned.]

80/20
This first implementation of the search capability is a good example of riding the 80/20 Rule.

The data that makes up the text for our blog posts in the Sqlite3 CMS database is simple text for the most part, so the like clause of the Sqlite3 SELECT statement goes a very long way towards satisfying our needs. Add in the fact that the effort to program the functionality is minimal, and we have ourselves a big win here.

We will have ample opportunities to improve on this first foray into more powerful and refined filters. But first we want to explore these basic features.
Future Lessons
In our search capability implementation we construct a fair bit of SQL statement syntax in code. In particular, we take the contents of the "Search Phrase" text field and just stuff it into a SELECT statement LIKE clause without any examination of any kind. We also construct a parenthetical list of Spiel IDs and just stick the text into a SELECT statement IN clause.

We haven't given all that much thought to those actions, but we are generally aware of the phenomenon of SQL-Injection attacks. So, a bit of caution here: When you construct SQL statements, particularly when some of your construction material is derived from an uncontrolled source, you may be subject to a SQL-Injection attack.

See: Little Bobby Tables

At some later date we'll actually address this issue and resolve our actions - or correct them** as some part of a future lesson.

And that brings us to a significant qualification of this BLOX blogging exercise: Not all of this code is suitable for hardened, commercial use.**

One of our goals is to offer a fertile code base for instruction. And our belief is that the instruction will be much more meaningful if it shows how to improve code instead of showing only the end result. We hope to demonstrate a reasonable example of the process of creation, including the unavoidable dead ends, the wrong-headed thrusts of design or specification, the too-facile approaches that leave gaping holes (not just unaddressed edge cases) in the implementation - and all sorts of other maladies that afflict the lives of programmers.

Ours is not an antiseptic exercise.

**Once again, we draw on the scope and status of the BLOX application at this stage of development. This code is for our own use, executing on our personal computer - which is under our control. Heck, we are still running with the Flask built-in web server - a definite production no-no - to write these posts.

So, have no fear for us. We judge our risk to be non-existent. And the opportunity for instruction and education, which are primary drivers of this exercise, is worth whatever exposure we face.

Comments

It will be some time yet before we get a comments section working here. In the meantime feel free to send comments via email. On this site our name is Joe Python. The email address is our first name at joepython.com.

Edited: 2019-01-30 20:55:12(utc) Generated: 2019-06-10 17:29:58(utc)