jedimatt42

TIPI - TI-99/4A to Raspberry PI interface development

On 3/14/2020 at 11:01 AM, BeeryMiller said:

I did see this in the daemon.log file, but I have slept since, and I thought the connection issue hit after I left for work which would have been 6:30 EST, but it looks like the log file may be off an hour.

 

Mar 13 14:59:22 tipi3 systemd[1]: tipi.service: Main process exited, code=killed, status=15/TERM
Mar 13 14:59:22 tipi3 systemd[1]: tipi.service: Succeeded.
Mar 13 14:59:22 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.
Mar 13 14:59:22 tipi3 systemd[1]: Started TI-99/4A DSR Service.
Mar 13 14:59:22 tipi3 systemd[1]: Stopping TI-99/4A DSR Service...
Mar 13 14:59:22 tipi3 systemd[1]: tipi.service: Main process exited, code=killed, status=15/TERM
Mar 13 14:59:22 tipi3 systemd[1]: tipi.service: Succeeded.
Mar 13 14:59:22 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.
Mar 13 14:59:22 tipi3 systemd[1]: Started TI-99/4A DSR Service.
Mar 13 14:59:23 tipi3 systemd[1]: Stopping TI-99/4A DSR Service...
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Main process exited, code=killed, status=15/TERM
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Succeeded.
Mar 13 14:59:23 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Start request repeated too quickly.
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Failed with result 'start-limit-hit'.
Mar 13 14:59:23 tipi3 systemd[1]: Failed to start TI-99/4A DSR Service.
Mar 13 14:59:23 tipi3 tipiwatchdog.sh[586]: Job for tipi.service failed.
Mar 13 14:59:23 tipi3 tipiwatchdog.sh[586]: See "systemctl status tipi.service" and "journalctl -xe" for details.
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Start request repeated too quickly.
Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Failed with result 'start-limit-hit'.
Mar 13 14:59:23 tipi3 systemd[1]: Failed to start TI-99/4A DSR Service.
Mar 13 14:59:23 tipi3 tipiwatchdog.sh[586]: Job for tipi.service failed.

 

daemon.log tipi.log tipi.log.1

 

So, at the same time as this, in the tipi.log, it looks like the service was being reset a bunch of times in a row... a little reset storm... there are no statements of error... just the watchdog service killing the tipi.service when the 4A sends the tipi-reset signal via cru. This usually only happens when a 4A resets to title screen. 
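For anyone wanting to spot such a reset storm without reading the log by eye, here is a minimal sketch that counts how many times the service was stopped within each second. The sample lines are copied from the daemon.log excerpt above; the function name `stops_per_second` and the idea of keying on the "Stopped" line are just assumptions for illustration, not anything from the TIPI code.

```python
from collections import Counter

# Lines copied from the daemon.log excerpt quoted above.
SAMPLE_LINES = [
    "Mar 13 14:59:22 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.",
    "Mar 13 14:59:22 tipi3 systemd[1]: Started TI-99/4A DSR Service.",
    "Mar 13 14:59:22 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.",
    "Mar 13 14:59:22 tipi3 systemd[1]: Started TI-99/4A DSR Service.",
    "Mar 13 14:59:23 tipi3 systemd[1]: Stopped TI-99/4A DSR Service.",
    "Mar 13 14:59:23 tipi3 systemd[1]: tipi.service: Start request repeated too quickly.",
]


def stops_per_second(lines, marker="Stopped TI-99/4A DSR Service"):
    """Count service stops per syslog timestamp (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if marker in line:
            # The first 15 characters of a syslog line are "Mon DD HH:MM:SS".
            counts[line[:15]] += 1
    return counts


def storm_seconds(lines, threshold=2):
    """Return the timestamps where stops reached the threshold."""
    return sorted(ts for ts, n in stops_per_second(lines).items() if n >= threshold)
```

Calling `storm_seconds(open("daemon.log"))` on a captured log would list the seconds in which the service was stopped at least twice; the threshold of 2 is an arbitrary choice.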

 

Do you @BeeryMiller have code that is automating this reset via cru that might have gone haywire? 

 

[email protected] 

 

12 hours ago, jedimatt42 said:

 

So, at the same time as this, in the tipi.log, it looks like the service was being reset a bunch of times in a row... a little reset storm... there are no statements of error... just the watchdog service killing the tipi.service when the 4A sends the tipi-reset signal via cru. This usually only happens when a 4A resets to title screen. 

 

Do you @BeeryMiller have code that is automating this reset via cru that might have gone haywire? 

 

[email protected] 

 

 

I went through and reviewed my code.  When I first load and initialize the program, I do a reset of the TIPI.  Afterwards, the reset of the TIPI does not take place again and that initialization process is not run again.

 

When a user hangs up, or is disconnected for being idle (3 minutes as programmed), then the port is closed.  I then jump into a loop sending a message of >22,>00,>07 looking to accept a connection.  I copied your send/recv msg code from the DSR into my code as running the code from the DSR itself must have been a timing issue with the Geneve as it would hang up in MDOS mode.  I've been using that same code for the past year or so, so no reason to suspect the actual send/recv code.

 

Two things come to my mind.  First, I can comment out the call that resets the TIPI.  I will do that in the next few minutes and see how things play out.  I think that code worked its way into the program when there was an issue some time ago.  That may have been addressed with the 1.50 or 1.51 updates. 

 

The only other thing that comes to my mind is whether the TIPI Accept command is possibly doing some kind of reset of the TIPI DSR service somewhere????

 

Beery

 

18 hours ago, jedimatt42 said:

 

So, at the same time as this, in the tipi.log, it looks like the service was being reset a bunch of times in a row... a little reset storm... there are no statements of error... just the watchdog service killing the tipi.service when the 4A sends the tipi-reset signal via cru. This usually only happens when a 4A resets to title screen. 

 

Do you @BeeryMiller have code that is automating this reset via cru that might have gone haywire? 

 

[email protected] 

 

I've got my code up and running.  So far, no issues after about 8 logins and timeouts.  I will know tomorrow how the app runs.


Beery


You'll find code in TIPI messaging called RESET, but that is handshaking code. Not signal level reset. The constant in messaging called RESET should be renamed HELLO. 

 

The *RESET signal that triggers the process restarts we see in the logs is an actual wire from the CPLD to a PI GPIO input. I probably should have put an external pull-up resistor on it, but there is an internal one enabled at the receiving GPIO pin, inside the PI processor. The code that triggers this, in all the code I have written, is only in the 4A powerup routine of my DSR.

 

Maybe there was a compound issue? Do the logs show the same reset storm at the time of failure? 

 

[email protected]

2 hours ago, jedimatt42 said:

You'll find code in TIPI messaging called RESET, but that is handshaking code. Not signal level reset. The constant in messaging called RESET should be renamed HELLO. 

 

The *RESET signal that triggers the process restarts we see in the logs is an actual wire from the CPLD to a PI GPIO input. I probably should have put an external pull-up resistor on it, but there is an internal one enabled at the receiving GPIO pin, inside the PI processor. The code that triggers this, in all the code I have written, is only in the 4A powerup routine of my DSR.

 

Maybe there was a compound issue? Do the logs show the same reset storm at the time of failure? 

 

[email protected]

 

The *RESET signal you are referencing in the 4A Powerup code is what I borrowed from your DSR code back quite some time ago.  I do not recall the circumstances, but there was some reason I needed it, or thought I needed it, from some behavior I was experiencing.  If I recall correctly, I think I was using it to force a socket disconnect from another system.

 

I will need to do a reboot of the PI and repeat the run to see when I get the issue and can review the logs.  When I get more details, I will follow-up.

 

Beery

 


Matt,

 

I have attached two files.  Access was lost after about 9 hours.  At the very end of the tipi.log file, there is a break in time.  Around 3/24 at 06:18, I logged into the BBS to check its status.  Later, from my workplace, I tried to connect and there was no connection.  In the daemon.log file, I am not sure what everything after 06:18, when I logged in and then out, means.

 

From tipi.log....

 

2020-03-24 06:18:59,737 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,741 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,744 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,748 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,751 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,755 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,759 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,762 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,766 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,769 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,773 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:18:59,827 TiSocket    : INFO     wrote 1 bytes to socket: 1
2020-03-24 06:19:01,219 TiSocket    : INFO     closing socket
2020-03-24 06:19:01,220 TiSocket    : INFO     closed socket: 1
2020-03-24 06:19:02,627 ClockFile   : INFO     clock mode:corcomp
2020-03-24 06:19:02,666 ClockFile   : INFO     close special? PI.CLOCK
2020-03-24 08:58:01,535 TiSocket    : INFO     connection socket given handleId 1
2020-03-24 08:58:01,543 ClockFile   : INFO     clock mode:corcomp

 

daemon.log tipi.log


Matt,

 

Not sure if you did something special with the services, but with 1.54, I ran for 36 hours without a problem.  I'm running 1.55 now and things look good.

 

Beery

 

12 hours ago, BeeryMiller said:

Matt,

 

Not sure if you did something special with the services, but with 1.54, I ran for 36 hours without a problem.  I'm running 1.55 now and things look good.

 

Beery

 

 

I didn't do anything  :) I haven't seen anything in the logs to indicate that anything is actually wrong.. :(

 

I don't even have a good guess. 

 

If I was in your shoes, I'd be inclined to try things like closing the server socket and re-opening it after every hour of non-use... but that's not a solution to a root cause. 
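A rough sketch of that "recycle after an hour of non-use" idea, written in Python only to show the bookkeeping (the BBS itself runs on the 4A/Geneve, so this is not its code). The class name `IdleRecycler`, the `reopen` callback, and the injectable `clock` are all hypothetical; the one-hour default comes from the suggestion above.

```python
import time


class IdleRecycler:
    """Call reopen() whenever no activity has been seen for idle_limit seconds."""

    def __init__(self, reopen, idle_limit=3600.0, clock=time.monotonic):
        self._reopen = reopen          # callback that closes and re-opens the listener
        self._idle_limit = idle_limit  # seconds of non-use before recycling
        self._clock = clock            # injectable so the logic can be tested
        self._last_activity = clock()

    def note_activity(self):
        """Record that a caller connected or sent data."""
        self._last_activity = self._clock()

    def poll(self):
        """Recycle the listener if it has been idle too long; return True if it did."""
        now = self._clock()
        if now - self._last_activity >= self._idle_limit:
            self._reopen()
            # Restart the idle window so the listener is not recycled on every poll.
            self._last_activity = now
            return True
        return False
```

The accept loop would call `poll()` on each pass and `note_activity()` whenever a caller connects. As noted above, this works around the symptom and does not address a root cause.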

 

[email protected]

4 hours ago, jedimatt42 said:

 

I didn't do anything  :) I haven't seen anything in the logs to indicate that anything is actually wrong.. :(

 

I don't even have a good guess. 

 

If I was in your shoes, I'd be inclined to try things like closing the server socket and re-opening it after every hour of non-use... but that's not a solution to a root cause. 

 

[email protected]

Hmmm.  OK.  Has me wondering if an earlier update may not have gone as expected, and the latest update corrected it.  Just thinking out loud here.

 

Beery

2 hours ago, BeeryMiller said:

Hmmm.  OK.  Has me wondering if an earlier update may not have gone as expected, and the latest update corrected it.  Just thinking out loud here.

 

Beery

Not likely. Most updates don't do much except a 'git pull' and a restart of the services.

 

For the service in question, you have restarted it many times.

 

It is more likely an intermittent network loss, or full log file tmpfs (ramdisk) 

 

You do single byte at a time messages, so it is possible, it creates a lot of logs... 

 

Logs are supposed to roll off, but the one time I tried to test that, I learned you shouldn't log at high speed to an SD-card. That is why it logs to a tmpfs now.

 

I don't think I ever verified the log roll after that. 

 

Something I can look into.

 

[email protected]

On 4/7/2020 at 10:57 AM, jedimatt42 said:

Not likely. Most updates don't do much except a 'git pull' and a restart of the services.

 

For the service in question, you have restarted it many times.

 

It is more likely an intermittent network loss, or full log file tmpfs (ramdisk) 

 

You do single byte at a time messages, so it is possible, it creates a lot of logs... 

 

Logs are supposed to roll off, but the one time I tried to test that, I learned you shouldn't log at high speed to an SD-card. That is why it logs to a tmpfs now.

 

I don't think I ever verified the log roll after that. 

 

Something I can look into.

 

[email protected]

 

Well, yesterday, after the system had been running for about 5 days, the BBS crashed where it was waiting for input from a socket.  This was about the time I had an internet connection failure.  I did capture the log files, just ran out of time yesterday with some other priorities to post things.  If you want to see the files, I will post this evening, otherwise, I won't tie up space here on the forum.

 

Not sure what size one can max out on the tmpfs if that is a setting you defined.  If it were a tmpfs issue, could a flash drive be plugged into the PI so that it avoids the SD card issue or do you think the logging feature would be just too much?  I guess I could ask is there a possibility to turn off some of the logging as a test if you think that is an avenue worth exploring?

 

Now, the intermittent network loss issue you also mention, I was having major problems yesterday with my AT&T service provider.  I've got a service call for them to be out today as there are line issues.  The network connection to the internet was up and down every 10 minutes.  Service tech had me pull the power on the router, and rebooted the router, plus they did some of their diagnostics on their end.  

 

Anyways, just some feedback.  I do realize my coding to use the TIPI as a BBS server is likely stress testing the system further than any other code at the moment.  If it turns out to not be feasible, then I will know.

 

Beery

31 minutes ago, BeeryMiller said:

 

Well, yesterday, after the system had been running for about 5 days, the BBS crashed where it was waiting for input from a socket.  This was about the time I had an internet connection failure.  I did capture the log files, just ran out of time yesterday with some other priorities to post things.  If you want to see the files, I will post this evening, otherwise, I won't tie up space here on the forum.

 

Not sure what size one can max out on the tmpfs if that is a setting you defined.  If it were a tmpfs issue, could a flash drive be plugged into the PI so that it avoids the SD card issue or do you think the logging feature would be just too much?  I guess I could ask is there a possibility to turn off some of the logging as a test if you think that is an avenue worth exploring?

 

Now, the intermittent network loss issue you also mention, I was having major problems yesterday with my AT&T service provider.  I've got a service call for them to be out today as there are line issues.  The network connection to the internet was up and down every 10 minutes.  Service tech had me pull the power on the router, and rebooted the router, plus they did some of their diagnostics on their end.  

 

Anyways, just some feedback.  I do realize my coding to use the TIPI as a BBS server is likely stress testing the system further than any other code at the moment.  If it turns out to not be feasible, then I will know.

 

Beery

The correct thing to do is for me to make sure the logs never exceed the allocated space. It was either 30meg or 100meg... 

 

I will squeeze that in today or tomorrow.

 

[email protected]


I verified that the log rolling is working... as it is configured presently:

 

/var/log == 100 MB tmpfs

/var/log/tipi/tipi.log -> rolls at 5 MB, with a max of 5 backup files, so total max space consumed: 30 MB

 

The rest of the system logs are also on the system, but they appear to be using less than 10 meg in total, and log rolling for them is handled by the operating system. 
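For reference, the arithmetic behind that 30 MB figure is one live file plus five backups at 5 MB each. Below is a minimal sketch of size-based rolling with Python's standard `logging.handlers.RotatingFileHandler`, using the numbers quoted above. It is an illustration under assumptions, not the actual TIPI configuration: the logger name, the function names, the format string (chosen to resemble the tipi.log lines earlier in the thread), and treating 1 MB as 1024*1024 bytes are all mine.

```python
import logging
import logging.handlers

MAX_BYTES = 5 * 1024 * 1024   # roll at 5 MB (assuming MB means 1024*1024 bytes)
BACKUP_COUNT = 5              # keep tipi.log.1 .. tipi.log.5


def worst_case_bytes(max_bytes, backup_count):
    """Live file plus every backup, each at most max_bytes."""
    return max_bytes * (backup_count + 1)


def make_rolling_logger(path, max_bytes=MAX_BYTES, backup_count=BACKUP_COUNT):
    """Return a logger that rolls its file by size (hypothetical helper)."""
    logger = logging.getLogger("tipi-sketch")
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    # Format picked to look like the tipi.log excerpt; an assumption, not TIPI's setting.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)-12s: %(levelname)-8s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With these values `worst_case_bytes(MAX_BYTES, BACKUP_COUNT)` comes to 30 MB, which leaves room for the other system logs inside a 100 MB tmpfs.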

 

---- 

 

Things you could try : 

 

Ethernet... Use the Ethernet port on your PI instead of the Wifi... Ethernet doesn't go-away out from under software the way that WiFi does... 

Build a watchdog service... I imagine something like a dead-man switch for the BBS... a script running on the PI that if network access from the PI is lost, it drops a file in /home/tipi/tipi_disk/... that the 4A software can then periodically read, and perform appropriate resets... 
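To make the dead-man switch idea a bit more concrete, here is a minimal sketch of the PI-side half: check whether the network is reachable and create or remove a flag file accordingly. Everything specific here is an assumption for illustration: the flag file name, probing by TCP connect to a public DNS server (8.8.8.8 port 53), and the function names. Whether the 4A can read a plain host file dropped into /home/tipi/tipi_disk/ as-is has not been verified here.

```python
import os
import socket


def network_is_up(host="8.8.8.8", port=53, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (assumed probe target)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def update_flag(flag_path, is_up):
    """Create the flag file while the network is down, remove it when it is back.

    Returns True if the flag file exists after the call.
    """
    if is_up:
        try:
            os.remove(flag_path)
        except FileNotFoundError:
            pass  # flag was never set, nothing to clear
        return False
    with open(flag_path, "w") as flag:
        flag.write("NETDOWN\n")
    return True
```

A cron job or systemd timer on the PI could run `update_flag("/home/tipi/tipi_disk/NETDOWN", network_is_up())` every minute (the file name NETDOWN is hypothetical), and the 4A software would then check for that file between callers and perform its resets.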

 

I will need more time ( a larger time window ) to set up network failure testing, and make the TIPI software more robust against intermittent failures for hosting server sockets.  No idea if I can succeed at this... or when.

 

[email protected]

1 hour ago, jedimatt42 said:

Ethernet... Use the Ethernet port on your PI instead of the Wifi... Ethernet doesn't go-away out from under software the way that WiFi does... 

Build a watchdog service... I imagine something like a dead-man switch for the BBS... a script running on the PI that if network access from the PI is lost, it drops a file in /home/tipi/tipi_disk/... that the 4A software can then periodically read, and perform appropriate resets... 

 

I will need more time ( a larger time window ) to set up network failure testing, and make the TIPI software more robust against intermittent failures for hosting server sockets.  No idea if I can succeed at this... or when.

 

[email protected]

Absolutely no problem Matt, and thanks for the suggestions.  Work on the things that matter to you.

 

It will be relatively easy to connect the sidecar PI to the ethernet connection.  As far as a dead-man switch for the BBS, I know what you are describing, but I will have to research things on how that could even be accomplished.  That's my problem to figure out.

 

If a monitor is plugged into the Raspberry PI 4+, does that have any impact on other services?  The reason I ask is that I plugged a HDMI monitor to it, but the monitor did not pick up any signal until the PI was rebooted when I was looking at things yesterday.  Didn't know if that would have slowed down the capabilities of the PI or not.


Plugging in a monitor shouldn't have done anything to the pi. At least in my network, I have three pis here and they all lose their Wi-Fi connection sooner or later and have to be restarted, either by restarting the Wi-Fi or by restarting the pi completely. It could be my environment, which is very RF saturated... Anyway, when one of my pis is plugged in with ethernet it doesn't lose connection. I wouldn't trust Wi-Fi connections for anything that you want to have reliable.

30 minutes ago, arcadeshopper said:

Plugging in a monitor shouldn't have done anything to the pi. At least in my network, I have three pis here and they all lose their Wi-Fi connection sooner or later and have to be restarted, either by restarting the Wi-Fi or by restarting the pi completely. It could be my environment, which is very RF saturated... Anyway, when one of my pis is plugged in with ethernet it doesn't lose connection. I wouldn't trust Wi-Fi connections for anything that you want to have reliable.

 

Good to know.  AT&T was out and replaced some connections, and I will switch from WiFi to Ethernet and restart the software.  He even left me an extra new router at the house at no charge.  I suspect my environment is very RF saturated as well.

 

Beery

 


Matt,

 

As an FYI, the ethernet cable did not solve the issue of the PI losing connection.  I was thinking about how to restart things on the PI with your watchdog.  At this point, I am not sure how this might be accomplished.  One thing I do know is that the Re(B)oot option of TIPICFG does fix the issue when it hits.  I don't see the source for TIPICFG on GitHub, so I am not able to extract the routine from it that accomplishes that.  I assume it is not the same routine in the TIPI DSR that does the powerup.  Can you share that routine?  If so, I can have the PI periodically reboot if there is not a user connected.

 

I don't know if there is any correlation, but it actually seems the ethernet connection is having more issues than the WIFI connection.  I'm losing the Ethernet connection multiple times daily and I am pretty sure it dropped multiple times even though my internet connection did not drop during the time interval in question.

 

Beery

 

25 minutes ago, BeeryMiller said:

Matt,

 

As an FYI, the ethernet cable did not solve the issue of the PI losing connection.  I was thinking about how to restart things on the PI with your watchdog.  At this point, I am not sure how this might be accomplished.  One thing I do know is that the Re(B)oot option of TIPICFG does fix the issue when it hits.  I don't see the source for TIPICFG on GitHub, so I am not able to extract the routine from it that accomplishes that.  I assume it is not the same routine in the TIPI DSR that does the powerup.  Can you share that routine?  If so, I can have the PI periodically reboot if there is not a user connected.

 

I don't know if there is any correlation, but it actually seems the ethernet connection is having more issues than the WIFI connection.  I'm losing the Ethernet connection multiple times daily and I am pretty sure it dropped multiple times even though my internet connection did not drop during the time interval in question.

 

Beery

 

Source for tipicfg is in the TIPI repo, under clients/tipicfg.

 

But to reboot the PI from the TI, open for output, PI.REBOOT, and close the file.

 

To tell that the reboot is done, wait a little bit.. 10 seconds probably, then try to read a file... PI.STATUS if you like. If the reboot is not done, the read will block until it has completed... 

 

[email protected]

8 hours ago, jedimatt42 said:

Source for tipicfg is in the TIPI repo, under clients/tipicfg.

 

But to reboot the PI from the TI, open for output, PI.REBOOT, and close the file.

 

To tell that the reboot is done, wait a little bit.. 10 seconds probably, then try to read a file... PI.STATUS if you like. If the reboot is not done, the read will block until it has completed... 

 

[email protected]

Thanks.  I saw the open of the PI.REBOOT, but thought there was more to it than that with something missing.  Hopefully, I have time to code that tonight.  

 

Beery

 

 


Going to post this as a reference for others that may be considering any server functionality on the PI/TIPI. 

 

This time, I had an appropriate set of keywords with Google of "raspberry pi detecting a loss of internet" and came up with this found at https://weworkweplay.com/play/rebooting-the-raspberry-pi-when-it-loses-wireless-connection-wifi/

 

This looks simpler to implement and does not require any TI-99/4A CPU cycles to do a PI.REBOOT. However, as I think about it more, I don't think it is going to be perfect "as is" either.  I think the CRON job on the PI is going to need a shorter interval than 5 minutes.  If the timing hit just right, I think my router could re-establish an internet connection in < 5 minutes and the check would miss the outage, so it may need to run once every minute.

 

I know that when the BBS is in server mode waiting for a connection, some "cause" keeps an incoming connection to the port from ever being picked up.  Right now, the presumption is that it may be related to a dropped internet connection.  If that is the case, then this CRON job will/should be sufficient.  I know I can still open an outbound socket, as TIPICFG does read some stats when it is loaded after the server side has stopped responding.

 

As the webpage above describes WIFI connectivity, I think what may be really needed is to ping an IP address somewhere else on the internet if running ethernet to the PI.

 

I should note that probably only someone running something in a server type environment should consider this CRON job for the TI-99/4A or Geneve.  Otherwise, someone using the TIPI predominantly for file storage may suddenly discover their PI is rebooting when they are in the middle of file access, etc.

 

Beery

 


I was going to say maybe rebooting the pi was overkill and just restarting the TIPI process was a better idea

2 hours ago, arcadeshopper said:

I was going to say maybe rebooting the pi was overkill and just restarting the TIPI process was a better idea

 

Let's say I have this script running:

 

ping -c4 www.google.com > /dev/null

 

if [ $? != 0 ]

then

     sudo ?????????

fi

 

I am assuming I would use the sudo command, but what do I place after the sudo to restart the TIPI service?

 

I chose to use www.google.com as that is a website very unlikely to ever go down whereas the example I gave above with the link used a router ip address.  For others, I am anticipating to put that script into a crontab so that it runs every minute.

 

Beery

Edited by BeeryMiller

5 hours ago, BeeryMiller said:

Let's say I have this script running:

 

ping -c4 www.google.com > /dev/null

 

if [ $? != 0 ]

then

     sudo ?????????

fi

 

I am assuming I would use the sudo command, but what do I place after the sudo to restart the TIPI service?

 

I chose to use www.google.com as that is a website very unlikely to ever go down whereas the example I gave above with the link used a router ip address.  For others, I am anticipating to put that script into a crontab so that it runs every minute.

 

Beery

I guess that would be: systemctl restart tipi.service

 

To check the status: systemctl status tipi.service

 

Note: instead of cron you could also set up a systemd timer, but that is just a matter of taste I'd say.
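Putting the ping check and the restart command together, a minimal Python sketch of the same logic might look like the following. It only wraps the two commands already mentioned in this thread (`ping -c4 www.google.com` and `systemctl restart tipi.service`); the function names and the injectable `run` parameter are assumptions made so the logic can be exercised without touching a real service. Whether restarting the service actually clears the stuck listener is still unconfirmed at this point in the thread.

```python
import subprocess


def ping_ok(host, count=4, run=subprocess.run):
    """Return True if ping exits with status 0 for the given host."""
    result = run(
        ["ping", "-c", str(count), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_if_offline(host="www.google.com", run=subprocess.run):
    """Restart tipi.service when the host cannot be pinged; return True if restarted."""
    if ping_ok(host, run=run):
        return False
    # sudo mirrors the script in the thread; it is redundant if this runs from root's crontab.
    run(["sudo", "systemctl", "restart", "tipi.service"], check=False)
    return True
```

A script that just calls `restart_if_offline()` could be run from a once-a-minute crontab entry or a systemd timer. As noted earlier in the thread, this only makes sense on a PI acting as a server, since it can restart the service in the middle of file access.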


Thanks for the info.  The BBS right now is up and running, so when it loses the ability to pick up a connection, I am going to exit the program, run telnet and log in to the system to check status, then restart the service, check status again, then reload the BBS program to confirm that restarting the TIPI service did the trick.


Beery
