Category: Coding

Confluent Kafka Queue Length

The documentation for the Python Confluent Kafka module includes a len function on the producer. I wanted to use the function because we’re getting a number of duplicated messages on the client, and I was trying to isolate what might be causing the problem. Unfortunately, calling producer.len() failed indicating there’s no len() method. I used dir(producer) to show that, no, there isn’t a len() method.

I realized today that the documentation is telling me that I can call the built-in len() function on a producer to get the queue length.

Code:

print(f"Before produce there are {len(producer)} messages awaiting delivery")
producer.produce(topic, key=bytes(str(int(cs.timestamp) ), 'utf8'), value=cs.SerializeToString() )
print(f"After produce there are {len(producer)} messages awaiting delivery")
producer.poll(0) # Per https://github.com/confluentinc/confluent-kafka-python/issues/16 for queue full error
print(f"After poll0 there are {len(producer)} messages awaiting delivery")

Output:

Before produce there are 160 messages awaiting delivery
After produce there are 161 messages awaiting delivery
After poll0 there are 155 messages awaiting delivery

Boolean Opts in Python

I have a few command line arguments on a Python script that are most readily used if they are boolean. I sometimes need a “verbose” option for script debugging — print a lot of extra stuff to show what’s going on, and I usually want a “dry run” option where the script reads data, performs calculations, and prints results to the screen without making any changes or sending data anywhere (database, email, etc). To use command line arguments as boolean values, I use a function that converts a variety of possible inputs to True/False.

def string2boolean(strInput):
    """
    :param strInput: String string to be converted to boolean
    :return: Boolean representation of input
    """
    if isinstance(strInput, bool):
        return strInput
    if strInput.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif strInput.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')

Use “type” when adding the argument to run the input through your function.

    parser.add_argument('-r', '--dryrun', action='store', type=string2boolean, dest='boolDryRun', default=False, help="Preview data processing without sending data to DB or Kafka. Valid values: 'true' or 'false'.")

Oracle – Collections and IN or LIKE Queries

I’ve been retrofitting a lot of PHP/SQL queries to use oci_bind_by_name recently. When using “IN” clauses, you can iterate through your array twice, but it’s an inefficient approach.

// Build the query string with a bunch of placeholders
$strQuery = "select Col1, Col2, Col3 from TableName where ColName IN (";
for($i=0; $i < count($array); $i++){
    if($i > 0){
        $strQuery = $strQuery . ", ";
    }
    $strQuery = $strQuery . ":bindvar" . $i;
}
$strQuery = $strQuery . ")";
... 
// Then bind each placeholder to something
for($i=0; $i < count($array); $i++){
    oci_bind_by_name($stmt, ":bindvar".$i, $array[$i]);
}

Building a table from the array data and using an Oracle collection object creates cleaner code and avoids a second iteration of the array:

$strQuery = "SELECT indexID, objName FROM table WHERE objName in (SELECT column_value FROM table(:myIds))";
$stmt = oci_parse($conn, $strQuery);

$coll = oci_new_collection($kpiprd_conn, 'ODCIVARCHAR2LIST','SYS');
foreach ($arrayValues as $strValue) {
     $coll->append($strValue);
}
oci_bind_by_name($stmt, ':myIds', $coll, -1, OCI_B_NTY);
oci_set_prefetch($stmt, 300);
oci_execute($stmt);

A simple like clause is quite straight-forward

$strNameLikeString = "SomeName%";
$strQuery = "SELECT ds_dvrsty_set_nm from ds_dvrsty_set WHERE ds_dvrsty_set_nm LIKE :divsetnm ORDER BY ds_dvrsty_set_nm DESC fetch first 1 row only";

$stmt = oci_parse($connDB, $strQuery);
oci_bind_by_name($stmt, ":divsetnm", $strNameLikeString);
oci_set_prefetch($stmt, 300);
oci_execute($stmt);

But what about an array of inputs essentially reproducing the LIKE ANY predicate in PostgreSQL? There’s not a direct equivalent in Oracle, and iterating through the array twice to build out a query WHERE (Field1 LIKE ‘Thing1%’ OR Field1 LIKE ‘Thing2%’ OR Field1 LIKE ‘Thing3%’) is undesirable. The with EXISTS allows me to create a LIKE ANY type query and only iterate through my array once to bind variables to placeholders using the same collection approach as was used with the IN clause.

$arrayLocs = array('ERIEPAXE%', 'HNCKOHXA%', 'LTRKARXK%');
$strQuery = "SELECT location_id, clli_code FROM network_location WHERE EXISTS (select 1 FROM TABLE(:likelocs) WHERE clli_code LIKE column_value)";
$stmt = oci_parse($connDB, $strQuery);

$coll = oci_new_collection($connDB, 'ODCIVARCHAR2LIST','SYS');
foreach ($arrayLocs as $strLocation) {
    $coll->append($strLocation);
}
oci_bind_by_name($stmt, ':likelocs', $coll, -1, OCI_B_NTY);
oci_execute($stmt);
print "<table>\n";
print "<tr><th>Loc ID</th><th>CLLI</th></tr>\n";
while ($row = oci_fetch_array($stmt, OCI_ASSOC+OCI_RETURN_NULLS)) {
    print "<tr><td>" . $row['LOCATION_ID'] . "</td><td>" . $row['CLLI_CODE'] . "</td></tr>\n";
}
print "</table>\n";

There are many different collection types in Oracle which can be used with oci_new_collection. A full list of the system collection types can be queried from the database.

SELECT * FROM SYS.ALL_TYPES WHERE TYPECODE = 'COLLECTION' and OWNER = 'SYS';

JQuery – Finding a set of checkboxes

A corollary to my JavaScript modifying checkbox values when the box is checked or unchecked … I needed a way to reset the form (in my form, the default is for the boxes to be checked and the value to be 1). The following code identifies all checkboxes with a particular class, checks them, and sets the value to 1.

/**
 * This function checks off each checkbox of the input class
 *
 * @param {string} strCheckboxClass     Name of class identifying in-scope checkboxes
 * @return {null} 
 *
 * @example
 *
 *     checkAllDatabases ('MyBoxes');
 */
 function checkAllDatabases(strCheckboxClass){
    arrayCheckboxes = $('.'+strCheckboxClass);
    for(i = 0; i < arrayCheckboxes.length; i++) {
        $( '#'+arrayCheckboxes[i].name).prop( "checked", true );
        $( '#'+arrayCheckboxes[i].name).val(1);
    } 
}

Changing checkbox value when (un)checked

This bit of code handles another rather esoteric scenario — I have a generic “go to this URL and download the resultant Excel file” JavaScript function. This is because I write a lot of reporting tools and didn’t want to write a lot of code for each new tool. The template is an input form with a submit button that calls the generic function. Params for the elements on the form from which values are read, the URL to call to generate the report, and the POST elements into which each corresponding form value is inserted gets stuffed. Works great for text inputs. Works fine for drop-downs. But the value of a checkbox is really a combination of the potential value (from the value tag) and the checked state. That is — my Button 1 has a potential value of 1, but if the box is checked or not is really important.

Instead of attempting to determine the type of element in each form input so I can evaluate the checked condition, I decided to just change the value when the checkbox state is changed. Now Button 1 has a potential value of 0 when unchecked and a potential value of 1 when checked. I don’t need to know if the box is checked because the value answers that question. So passing along button1’s value to my URL lets the target site know if I want whatever Button 1 represents. (In this case, users are able to select from a list of seven data sources — smaller numbers of data sources reduce the query time but also fail to provide the most robust report).

The JavaScript to handle changing the checkbox value when the checked state changes:

$("#button1").change(function () {
    if ($("#button1").is(':checked')) {
        $("#button1").val(1);
    }
    else{
        $("#button1").val(0);
    }
});

$("#button2").change(function () {
    if ($("#button2").is(':checked')) {
        $("#button2").val(1);
    }
    else{
        $("#button2").val(0);
    }
});

The HTML defining these two checkboxes:

<input type="checkbox" id="button1" name="button1" value="1" checked><label for="ngmss">Thing 1</label>
<input type="checkbox" id="button2" name="button2" value="1" checked><label for="ngmss">Thing 2</label>

Google OAUTH Stuff

Reminder to self — when you set up a desktop app with OAUTH to use the Google APIs … you have to hit the authorization URL from the computer running the code. That means, for my calendar scraper, that I need to do X-redirection from the server & run the script. Firefox launches & the flow actually completes. Attempting to hit the URL from my computer yields a connection failure to the https://localhost:SomePort at the end of the workflow.

Move token.pickle to backup file, run getCalendarEvents.py with X-redirection so auth can be processed through web form.

Python: dir

I am writing this down because I never manage to remember these two super useful functions that tells you what a variable is.

iLastProcessedTimestamp = 0
with open(‘test.txt’) as f:
iLastProcessedTimestamp = int(f.readline())
print(dir(iLastProcessedTimestamp))
print(type(iLastProcessedTimestamp))

The type function tells you the variable’s class (in this case, int). The dir function tells you the attributes of the variable.

Pylint — Ignoring Errors

MS Word has an ‘ignore this error’ thing in the grammar checker that I use fairly regularly — technical writing has syntax that reads as wrong, grammatical errors for impact, or informal writing where I don’t much care for some rules of grammar … I don’t want to turn off the grammar checker, but I do want to stop seeing a squiggly line under a specific sentence that I don’t want to change. Turns out Pylint has something similar:

PIP SSL Error

Upgraded pip today, and I pretty quickly regretted it. SSL Error attempting to install anything from the Internet (and, amazingly, some things where I downloaded the wheel file). The answer is to downgrade PIP until you hit a version that doesn’t have the error. Annoying. Not sure what the latest rev I could have used was — going back one level and getting the error in loop was more time than I could devote to the project, so I just jumped back six months. Had success with 20.0.2 and left working alone.

Everything from 20.3.1 through 21.0.1 has this failure:

D:\tmp\5\pip>pip install basic_sftp
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘SSLError(SSLError(1, ‘[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)’))’: /simple/basic-sftp/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘SSLError(SSLError(1, ‘[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)’))’: /simple/basic-sftp/
WARNING: You are using pip version 20.3.1; however, version 21.0.1 is available.
You should consider upgrading via the ‘c:\programs\anaconda3\python.exe -m pip install –upgrade pip’ command.